We determine the top quark mass $m_t$ using $t\bar{t}$ pairs produced in the D0 detector by $\sqrt{s} = 1.8$ TeV
$p\bar{p}$ collisions in a 125 pb$^{-1}$ exposure at the Fermilab Tevatron. We make a two constraint fit to
$m_t$ in $t\bar{t} \to bW^{+}\,\bar{b}W^{-}$ final states with one W boson decaying to $q\bar{q}$ and the other to $e\nu$ or $\mu\nu$.
Likelihood fits to the data yield $m_t(l+{\rm jets}) = 173.3 \pm 5.6$ (stat) $\pm 5.5$ (syst) GeV/$c^2$. When this
result is combined with an analysis of events in which both W bosons decay into leptons, we obtain
$m_t = 172.1 \pm 5.2$ (stat) $\pm 4.9$ (syst) GeV/$c^2$. An alternate analysis, using three constraint fits to
fixed top quark masses, gives $m_t(l+{\rm jets}) = 176.0 \pm 7.9$ (stat) $\pm 4.8$ (syst) GeV/$c^2$, consistent with
the above result. Studies of kinematic distributions of the top quark candidates are also presented.

PACS numbers: 14.65.Ha, 13.85.Qk, 13.85.Ni

The discovery of the top quark by the CDF [?] and
D0 collaborations at the Fermilab Tevatron ended the
search phase of top quark physics. Since then, emphasis
has shifted to determining its properties, especially its
large mass (about 200 times that of a proton) and production
cross section. Reviews of searches for and the
initial observations of the top quark are given in Ref. [?].
Details of the initial D0 top quark search can be found in
Ref. [?]. This paper reports on the determination of the
top quark mass using all the data collected by the D0
experiment during the 1992-1996 Tevatron runs. This is
more than twice as much data as was available for the
initial observation. In addition, improvements have been
made in event selection, object reconstruction, and mass
analysis techniques. The result is a reduction of the statistical
and systematic errors by nearly a factor of four.
A short paper giving results from this analysis has been
published [?].

The top quark is one of the fundamental fermions in 
the standard model of electroweak interactions and is 
the weak-isospin partner of the bottom quark. For a 
top quark with mass substantially greater than that of 
the W boson, the standard model predicts it to decay 
promptly (before hadronization) to a W boson plus a
bottom quark with a branching fraction of nearly 100%. 
A precision measurement of the top quark mass, along 
with the W boson mass and other electroweak data, can 
set constraints on the mass of the standard model Higgs 
boson. It may also be helpful in understanding the origin 
of quark masses. 

In $p\bar{p}$ collisions at a 1.8 TeV center of mass energy,
top quarks are produced primarily as $t\bar{t}$ pairs. Each decays
into a W boson plus a bottom quark, resulting in
events having several jets and often a charged lepton.
Due to the large top quark mass, these final state objects
tend to have large momenta transverse to the $p\bar{p}$
direction. About 30% of $t\bar{t}$ decays have a single electron
or muon (from the decay of one of the W bosons) with
a large transverse momentum. Typically, the neutrino
that accompanies this electron or muon will also have a
large transverse momentum, producing significant missing
transverse energy. These characteristics allow for the
selection of a sample of "lepton + jets" events with an
enriched signal to background ratio. This sample is the
basis for the top quark mass analysis reported in this
paper. It also comprises a large portion of the data sample
used for the measurement of the $p\bar{p} \to t\bar{t}$ production
cross section [?]. A similar mass analysis for the final
state with two charged leptons plus jets is described in
Ref. [?].

Three methods have been used to determine the
top quark mass in the lepton + jets channels. Two of
them use constrained variable-mass kinematic fits to obtain
a best-fit mass value for each event. The top quark
mass is then extracted using a maximum likelihood fit to
a two-dimensional distribution, with one axis being the
best-fit mass, and the other being a variable which discriminates
$t\bar{t}$ events from the expected backgrounds. The
difference between these two methods is in the discriminant
variable and the binning used. The third method
uses $\chi^2$ values from fixed-mass kinematic fits. A cut is
made using a top quark discriminant to select a sample
of events with low background. The expected contribution
from the background is subtracted from the distribution
of $\chi^2$ versus mass, and the resulting background-subtracted
distribution is fit near the minimum to extract
the top quark mass.

This paper is organized as follows. Section II briefly
describes aspects of the D0 detector essential for this
analysis. Section III discusses event selection, including
triggers, particle identification, and the criteria used to
select the initial event sample. Section IV describes the
jet energy corrections. Section V discusses the simulation
of $t\bar{t}$ signal and background events. Section VI defines
the two discriminants used to separate top quark events
from background. Section VII describes the variable-mass
kinematic fits to individual events and the likelihood
fits used to extract the top quark mass, and gives
results from these fits. Section VIII describes the pseudolikelihood
method (which uses fixed-mass kinematic fits),
gives results from it, and compares these results with
those from the two likelihood methods. Section IX examines
some kinematic properties of top quark events.
Finally, conclusions are presented in Sec. X.



II. THE D0 DETECTOR 

D0 is a multipurpose detector designed to study $p\bar{p}$
collisions at the Fermilab Tevatron Collider. The detector
was commissioned during the summer of 1992. The
work presented here is based on approximately 125 pb$^{-1}$
of accumulated data recorded during the 1992-1996 collider
runs. A full description of the detector may be
found in Ref. [?]. Here, we describe briefly the properties
of the detector that are relevant for the top quark mass
measurement.

The detector was designed to have good electron and
muon identification capabilities, and to measure jets and
missing transverse energy $\not\!\!E_T$ with good resolution. The
detector consists of three major systems: a nonmagnetic
central tracking system, a hermetic uranium liquid-argon
calorimeter, and a muon spectrometer. A cut away view
of the detector is shown in Fig. 1.

FIG. 1. Cut away isometric view of the D0 detector.

The central detector (CD) consists of four tracking
subsystems: a vertex drift chamber, a transition radiation
detector (not used for this analysis), a central drift
chamber, and two forward drift chambers. It measures
the trajectories of charged particles and can discriminate
between single charged particles and $e^+e^-$ pairs from
photon conversions by measuring the ionization along
their tracks. It covers the region $|\eta| < 3.2$ in pseudorapidity,
where $\eta = \tanh^{-1}(\cos\theta)$. (We define $\theta$ and $\phi$ to
be the polar and azimuthal angles, respectively.)
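
The pseudorapidity definition above is easy to check numerically. The following is a minimal Python sketch (the function names are illustrative and are not taken from any D0 software) that evaluates $\eta = \tanh^{-1}(\cos\theta)$ and compares it with the equivalent form $-\ln\tan(\theta/2)$.

```python
import math


def pseudorapidity(theta):
    """Return eta = atanh(cos(theta)) for a polar angle theta in radians.

    Valid for 0 < theta < pi; eta diverges along the beam axis.
    """
    return math.atanh(math.cos(theta))


def pseudorapidity_alt(theta):
    """Equivalent closed form: eta = -ln(tan(theta / 2))."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse relation: the polar angle (radians) for a given eta."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    # Polar angle at the edge of the central detector coverage, |eta| = 3.2.
    print(math.degrees(polar_angle(3.2)))
```

A particle emitted perpendicular to the beam has $\eta = 0$, and the tracking coverage limit $|\eta| < 3.2$ corresponds to polar angles within a few degrees of the beam line.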

The calorimeter is divided into three parts: the central
calorimeter (CC) and the two end calorimeters (EC),
which together cover the pseudorapidity range $|\eta| < 4.2$.
The inner electromagnetic (EM) portion of the calorimeters
is 21 radiation lengths deep, and is divided into four
longitudinal segments (layers). The outer hadronic portions
are 7-9 nuclear interaction lengths deep, and are divided
into four (CC) or five (EC) layers. The calorimeters
are transversely segmented into pseudoprojective towers
with $\Delta\eta \times \Delta\phi = 0.1 \times 0.1$. The third layer of the
EM calorimeter, in which the maximum of
EM showers is expected, is segmented twice as finely in
both $\eta$ and $\phi$, with cells of size $\Delta\eta \times \Delta\phi = 0.05 \times 0.05$.

Since muons from top quark decays populate predominantly
the central region, this work uses only the central
portion of the D0 muon system, covering $|\eta| < 1.7$. This
system consists of four planes of proportional drift tubes
in front of magnetized iron toroids with a magnetic field
of 1.9 T and two groups of three planes each of proportional
drift tubes behind the toroids. The magnetic field
lines and the wires in the drift tubes are oriented transversely
to the beam direction. The muon momentum
$p^\mu$ is measured from the muon's deflection angle in the
magnetic field of the toroid.

A separate synchrotron, the Main Ring, lies above the 
Tevatron and passes through the outer region of the D0 
calorimeter. During data-taking, it is used to acceler- 
ate protons for antiproton production. Losses from the 
Main Ring may deposit energy in the calorimeters, increasing
the instrumental background. We reject much
of this background at the trigger level by not accepting
triggers during injection into the Main Ring, when losses 
are large. Some triggers are also disabled whenever a 
Main Ring bunch passes through the detector or when 
losses are registered in scintillation counters around the 
Main Ring. 



III. EVENT SELECTION

For the purposes of this analysis, we divide the lepton
+ jets final states into electron and muon channels. We
further subdivide these channels based on whether or not
a muon consistent with $b \to \mu + X$ is present. We thus
have four channels, which will be denoted $e$+jets, $\mu$+jets,
$e$+jets/$\mu$, and $\mu$+jets/$\mu$.

The event sample used for determining the top quark
mass is selected using criteria similar to those used for
the $t\bar{t}$ production cross section measurement [?], with
the exception of the cuts on the event shape variables
$H_T \equiv \sum E_T^{\rm jet}$ and aplanarity. The particle identification,
trigger requirements, and event selection cuts are summarized
below. More detailed information about triggering,
particle identification, and jet and $\not\!\!E_T$ reconstruction
may be found in Ref. [?]. (Note, however, that the current
electron and muon identification algorithms provide
better rejection of backgrounds and higher efficiencies
than those used in Ref. [?].)

A. Particle identification

1. Electrons

Electron identification is based on a likelihood technique.
Candidates are first identified by finding isolated
clusters of energy in the EM calorimeter with a matching
track in the central detector. We then cut on a likelihood
constructed from the following four variables:

• The $\chi^2$ from a covariance matrix which measures
the consistency of the calorimeter cluster shape
with that of an electron shower.

• The electromagnetic energy fraction, defined as the
ratio of the portion of the energy of the cluster
found in the EM calorimeter to its total energy.

• A measure of the consistency between the track position
and the cluster centroid.

• The ionization $dE/dx$ along the track.

To a good approximation, these four variables are independent
of each other for electron candidates.

Electrons from W boson decay tend to be isolated,
even in $t\bar{t}$ events. Thus, we make the additional cut

$$\frac{E_{\rm tot}(0.4) - E_{\rm EM}(0.2)}{E_{\rm EM}(0.2)} < 0.1, \tag{3.1}$$

where $E_{\rm tot}(0.4)$ is the energy within $\Delta R < 0.4$ of the
cluster centroid ($\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$) and $E_{\rm EM}(0.2)$ is
the energy in the EM calorimeter within $\Delta R < 0.2$.

2. Muons

Two types of muon selection are used in this analysis.
The first is used to identify isolated muons from $W \to \mu\nu$
decay. The other is used to tag b-jets by identifying "tag"
muons consistent with originating from $b \to \mu + X$ decay.

Besides cuts on the muon track quality, both selections
require that:

• The muon pseudorapidity $|\eta^\mu| < 1.7$.

• The magnetic field integral > 2.0 T·m (equivalent
to a momentum change of 0.6 GeV/c).

• The energy deposited in the calorimeter along a
muon track be at least that expected from a minimum
ionizing particle.

For isolated muons, we apply the following additional
selection requirements:

• Transverse momentum $p_T > 20$ GeV/c.

• The distance in the $\eta$-$\phi$ plane between the muon
and the closest jet $\Delta R(\mu, j) > 0.5$.

For tag muons, we instead require:

• $p_T > 4$ GeV/c.

• $\Delta R(\mu, j) < 0.5$.

3. Jets and missing $E_T$

Jets are reconstructed in the calorimeter using a fixed-size
cone algorithm. We use a cone size of $\Delta R = 0.5$.

Neutrinos are not detected directly. Instead, their
presence is inferred from missing transverse energy $\not\!\!E_T$.
Two different definitions of $\not\!\!E_T$ are used in the event
selection:

• $\not\!\!E_T^{\rm cal}$, the calorimeter missing $E_T$, obtained from the
transverse energy of all calorimeter cells.

• $\not\!\!E_T$, the muon corrected missing $E_T$, obtained by
subtracting the transverse momenta of identified
muons from $\not\!\!E_T^{\rm cal}$.
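
The two missing $E_T$ definitions used in the event selection can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the D0 reconstruction code: the function names and the input format (lists of transverse energy and azimuth pairs) are hypothetical, and effects such as the small calorimeter deposit left by each muon are ignored.

```python
import math


def calorimeter_met(cells):
    """Calorimeter missing E_T vector from (E_T, phi) pairs of all cells.

    The missing E_T is the negative of the vector sum of the cell
    transverse energies. Energies in GeV, phi in radians.
    """
    mex = -sum(et * math.cos(phi) for et, phi in cells)
    mey = -sum(et * math.sin(phi) for et, phi in cells)
    return mex, mey


def muon_corrected_met(cal_met, muons):
    """Subtract the transverse momenta of identified muons, given as
    (p_T, phi) pairs, from the calorimeter missing E_T vector."""
    mex, mey = cal_met
    for pt, phi in muons:
        mex -= pt * math.cos(phi)
        mey -= pt * math.sin(phi)
    return mex, mey


def magnitude(vec):
    """Length of a two-component transverse vector."""
    return math.hypot(vec[0], vec[1])
```

For example, a single 30 GeV calorimeter deposit balanced by a 30 GeV/c muon in the opposite direction gives a calorimeter missing $E_T$ of 30 GeV but a muon corrected missing $E_T$ of zero.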
B. Triggers

The D0 trigger system is responsible for reducing the 
event rate from the beam crossing rate of 286 kHz to the 
approximately 3-4 Hz which can be recorded on tape. 
The first stage of the trigger (level 1) makes fast ana- 
log sums of the transverse energies in calorimeter trigger 
towers. These towers have a size of $\Delta\eta \times \Delta\phi = 0.2 \times 0.2$
and are segmented longitudinally into electromagnetic 
and hadronic sections. The level 1 trigger operates on 
these sums along with patterns of hits in the muon spec- 
trometer. It can make a trigger decision within the space 
of a single beam crossing (unless a level 1.5 decision is 
required; see below). After level 1 accepts an event, the 
complete event is digitized and sent to the level 2 trigger, 
which consists of a farm of 48 general-purpose processors. 
Software filters running in these processors make the final 
trigger decision. 

The triggers used are defined in terms of combinations
of specific objects (electron, muon, jet, $\not\!\!E_T$) required in
the level 1 and level 2 triggers. These elements are summarized
below. For more information on the D0 trigger
system, see Refs. [?].

To trigger on electrons, level 1 requires that the trans- 
verse energy in the EM section of a trigger tower be above 
a programmed threshold. The level 2 electron algorithm 
examines the regions around the level 1 towers which are 
above threshold, and uses the full segmentation of the 
EM calorimeter to identify showers with shapes consis- 
tent with those of electrons. The level 2 algorithm can 
also apply an isolation requirement or demand that there 
be an associated track in the central detector. 

For the latter portion of the run, a "level 1.5" processor 
was also available for electron triggering. The $E_T$ of each
EM trigger tower above the level 1 threshold is summed 
with the neighboring tower with the most energy. A cut is 
then made on this sum. The hadronic portions of the two 
towers are also summed, and the ratio of EM transverse 
energy to total transverse energy in the two towers is 
required to be above 0.85. The use of a level 1.5 electron 
trigger is indicated in the tables below as an "EX" tower. 
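
The level 1.5 electron requirement described above can be sketched as follows. This is an illustrative Python sketch only: the tower map format, the choice of the eight adjacent towers as "neighbors", the ranking of neighbors by EM transverse energy, and the function name are all assumptions made for the example, and azimuthal wrap-around is ignored.

```python
def level15_em_candidate(towers, seed, et_threshold, min_em_fraction=0.85):
    """Apply a level 1.5 style electron requirement to one seed tower.

    towers: dict mapping (ieta, iphi) -> (em_et, had_et) in GeV.
    seed:   (ieta, iphi) of an EM tower already above the level 1 threshold.
    The seed is summed with its most energetic neighbor; the EM sum must
    exceed et_threshold and the EM fraction of the pair must exceed
    min_em_fraction.
    """
    em_seed, had_seed = towers[seed]
    ieta, iphi = seed
    neighbors = [
        (ieta + de, iphi + dp)
        for de in (-1, 0, 1)
        for dp in (-1, 0, 1)
        if (de, dp) != (0, 0) and (ieta + de, iphi + dp) in towers
    ]
    if neighbors:
        # Assumption: "most energy" is judged by EM transverse energy.
        best = max(neighbors, key=lambda key: towers[key][0])
        em_nb, had_nb = towers[best]
    else:
        em_nb, had_nb = 0.0, 0.0

    em_sum = em_seed + em_nb
    total = em_sum + had_seed + had_nb
    if total <= 0.0:
        return False
    return em_sum > et_threshold and em_sum / total > min_em_fraction
```

With a 15 GeV threshold, a 12 GeV seed next to a 5 GeV EM neighbor and 1 GeV of total hadronic energy passes (EM fraction about 0.94), while the same pair with 6 GeV of hadronic energy fails the 0.85 ratio requirement.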

The level 1 muon trigger uses the pattern of drift tubes 
with hits to provide the number of muon candidates in 
different regions of the muon spectrometer. A level 1.5 
processor may optionally be used to put a $p_T$ requirement
on the candidates (at the expense of slightly increased 
dead time). In level 2, the full digitized data are avail- 
able, and the first stage of the full event reconstruction 
is performed. The level 2 muon algorithm can optionally 
require the presence of an energy deposit in the calorime- 
ter consistent with that from a muon; this is indicated in 
the tables below by "cal confirm" . 

For a jet trigger, level 1 requires that the sum of the
transverse energies in the EM and hadronic sections of a
trigger tower be above a programmed threshold. Alternatively,
level 1 can sum the transverse energies within
"large tiles" of size $0.8 \times 1.6$ in $\eta \times \phi$ and cut on these
sums. Level 2 then sums calorimeter cells around the
identified towers (or around the $E_T$-weighted centroids
of the large tiles) in cones of a specified radius $\Delta R$, and
imposes a cut on the total transverse energy.

The $\not\!\!E_T$ in the calorimeter can also be computed in
both level 1 and level 2. The z position used for the in- 
teraction vertex in level 2 is determined from the relative 
timing of hits in scintillation counters located in front of 
each EC (level 0). 

The trigger requirements used for this analysis are
summarized in Tables I-III. These tables are divided according
to the three major running periods. Run 1a was
from 1992-1993, run 1b was from 1994-1995, and run 1c
was during the winter of 1995-1996. Note that not all the
triggers listed were active simultaneously, and that differing
requirements were used to veto possible Main Ring
events. In addition, some of the triggers were prescaled
at high luminosity. The "exposure" column in the tables
takes these factors into account.



C. Event selection 

The first set of cuts used to define the sample for mass
analysis is very similar to that used for the cross section
analysis [?]:

• An isolated electron or muon with $E_T > 20$ GeV.

• $|\eta^e| < 2.0$ or $|\eta^\mu| < 1.7$.

• At least 4 jets with $E_T > 15$ GeV and $|\eta^{\rm jet}| < 2.0$.

• $\not\!\!E_T^{\rm cal} > 25$ GeV for $e$+jets (untagged) or $\not\!\!E_T^{\rm cal} >
20$ GeV for $\mu$+jets (both tagged and untagged).

• $\not\!\!E_T > 20$ GeV.

We reject events which contain photons, defined as isolated
clusters in the EM calorimeter with shapes consistent with
an EM shower and with a poor match to any track in
the central detector, and satisfying $E_T > 15$ GeV and
$|\eta| < 2$. Three such events are rejected. We also reject
events which contain extra isolated high-$p_T$ electrons or
which fail additional cuts to remove calorimeter noise and
Main Ring effects.

After these cuts, the remaining background is primar- 
ily W + jets, with a small (~ 20%) admixture of QCD 
multijet events in which a jet is misidentified as a lepton. 

If a candidate has a tag muon, we require it to pass
additional cuts on the direction of the $\not\!\!E_T$ vector. For the
$e$+jets/$\mu$ channel, we require

• $\not\!\!E_T > 35$ GeV, if $\Delta\phi(\not\!\!E_T, \mu) < 25^\circ$,

while for the $\mu$+jets/$\mu$ channel, we require that the
highest-$p_T$ muon satisfy

• $\Delta\phi(\not\!\!E_T, \mu) < 170^\circ$ and

• $|\Delta\phi(\not\!\!E_T, \mu) - 90^\circ|/90^\circ < \not\!\!E_T/(45\ {\rm GeV})$.
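
The direction cuts for tagged events can be written compactly in code. The following Python sketch is an illustration of the requirements listed above; the function names, the channel labels, and the use of degrees for the azimuthal angles are choices made for this example only.

```python
def delta_phi_deg(phi1, phi2):
    """Absolute azimuthal separation in degrees, folded into [0, 180]."""
    diff = abs(phi1 - phi2) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def passes_tag_direction_cuts(channel, met, met_phi, mu_phi):
    """Missing E_T direction cuts for events with a tag muon.

    channel: "e+jets/mu" or "mu+jets/mu".
    met:     missing E_T in GeV.
    met_phi, mu_phi: azimuth (degrees) of the missing E_T vector and of the
                     relevant muon (the highest-p_T muon for mu+jets/mu).
    """
    dphi = delta_phi_deg(met_phi, mu_phi)
    if channel == "e+jets/mu":
        # The tighter missing E_T cut applies only when the missing E_T
        # points close to the tag muon.
        return met > 35.0 if dphi < 25.0 else True
    if channel == "mu+jets/mu":
        return dphi < 170.0 and abs(dphi - 90.0) / 90.0 < met / 45.0
    raise ValueError("unknown channel: " + channel)
```

In the $\mu$+jets/$\mu$ channel the second condition tightens as the missing $E_T$ decreases: at 45 GeV or above any separation below $170^\circ$ passes, while a low missing $E_T$ must be roughly perpendicular to the muon.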
TABLE I. Triggers used during run 1a (1992-1993). "Exposure" gives the effective integrated luminosity for each trigger,
taking into account any prescaling.

Name: ELE-HIGH
Exposure (pb$^{-1}$): 11.0
Level 1: 1 EM tower, $E_T > 10$ GeV
Level 2: 1 isolated $e$, $E_T > 20$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: ELE-JET
Exposure (pb$^{-1}$): 14.4
Level 1: 1 EM tower, $E_T > 10$ GeV, $|\eta| < 2.6$
         2 jet towers, $E_T > 5$ GeV
Level 2: 1 $e$, $E_T > 15$ GeV, $|\eta| < 2.5$
         2 jets ($\Delta R = 0.3$), $E_T > 10$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 10$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: MU-JET-HIGH
Exposure (pb$^{-1}$): 10.2
Level 1: 1 $\mu$, $|\eta| < 2.4$
         1 jet tower, $E_T > 5$ GeV
Level 2: 1 $\mu$, $p_T > 8$ GeV/c
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV
Used by: $\mu$+jets, $\mu$+jets/$\mu$



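TABLE II. Same as Table I for run 1b (1994-1995).

Name: EM1-EISTRKCC-MS
Exposure (pb$^{-1}$): 93.4
Level 1: 1 EM tower, $E_T > 10$ GeV
         1 EX tower, $E_T > 15$ GeV (a)
Level 2: 1 isolated $e$ w/track, $E_T > 20$ GeV
         $\not\!\!E_T^{\rm cal} > 15$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: ELE-JET-HIGH
Exposure (pb$^{-1}$): 98.0
Level 1: 1 EM tower, $E_T > 12$ GeV, $|\eta| < 2.6$
         2 jet towers, $E_T > 5$ GeV, $|\eta| < 2.0$
Level 2: 1 $e$, $E_T > 15$ GeV, $|\eta| < 2.5$
         2 jets ($\Delta R = 0.3$), $E_T > 10$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 14$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: MU-JET-HIGH
Exposure (pb$^{-1}$): 66.4
Level 1: 1 $\mu$, $p_T > 7$ GeV/c (a), $|\eta| < 1.7$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$ (a)
Level 2: 1 $\mu$, $p_T > 10$ GeV/c, $|\eta| < 1.7$
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: MU-JET-CAL
Exposure (pb$^{-1}$): 88.0
Level 1: 1 $\mu$, $p_T > 7$ GeV/c (a), $|\eta| < 1.7$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$ (a)
Level 2: 1 $\mu$, $p_T > 10$ GeV/c, $|\eta| < 1.7$, cal confirm
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: MU-JET-CENT
Exposure (pb$^{-1}$): 48.5
Level 1: 1 $\mu$, $|\eta| < 1.0$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$
Level 2: 1 $\mu$, $p_T > 10$ GeV/c, $|\eta| < 1.0$
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: MU-JET-CENCAL
Exposure (pb$^{-1}$): 51.2
Level 1: 1 $\mu$, $|\eta| < 1.0$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$
Level 2: 1 $\mu$, $p_T > 10$ GeV/c, $|\eta| < 1.0$, cal confirm
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: JET-3-MU
Exposure (pb$^{-1}$): 11.9
Level 1: 3 jet towers, $E_T > 5$ GeV
         $\not\!\!E_T^{\rm cal} > 20$ GeV
Level 2: 3 jets ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 17$ GeV
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: JET-3-MISS-LOW
Exposure (pb$^{-1}$): 57.8
Level 1: 3 large tiles, $E_T > 15$, $|\eta| < 2.4$
         3 jet towers, $E_T > 7$ GeV, $|\eta| < 2.6$
Level 2: 3 jets ($\Delta R = 0.5$), $E_T > 15$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 17$ GeV
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: JET-3-L2MU
Exposure (pb$^{-1}$): 25.8
Level 1: 3 large tiles, $E_T > 15$, $|\eta| < 2.4$
         3 jet towers, $E_T > 7$ GeV, $|\eta| < 2.6$
Level 2: 1 $\mu$, $p_T > 6$ GeV/c, $|\eta| < 1.7$, cal confirm
         3 jets ($\Delta R = 0.5$), $E_T > 15$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 17$ GeV
Used by: $\mu$+jets, $\mu$+jets/$\mu$

(a) This cut was looser than indicated during early portions of the run.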



TABLE III. Same as Table I for run 1c (1995-1996).

Name: ELE-JET-HIGH
Exposure (pb$^{-1}$): 1.9
Level 1: 1 EM tower, $E_T > 12$ GeV, $|\eta| < 2.6$
         2 jet towers, $E_T > 5$ GeV, $|\eta| < 2.0$
Level 2: 1 $e$, $E_T > 15$ GeV, $|\eta| < 2.5$
         2 jets ($\Delta R = 0.3$), $E_T > 10$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 14$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: ELE-JET-HIGHA
Exposure (pb$^{-1}$): 11.0
Level 1: 1 EM tower, $E_T > 12$ GeV, $|\eta| < 2.6$
         2 jet towers, $E_T > 5$ GeV, $|\eta| < 2.0$
         1 EX tower, $E_T > 15$ GeV
Level 2: 1 $e$, $E_T > 17$ GeV, $|\eta| < 2.5$
         2 jets ($\Delta R = 0.3$), $E_T > 10$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 14$ GeV
Used by: $e$+jets, $e$+jets/$\mu$

Name: MU-JET-CENT
Exposure (pb$^{-1}$): 8.9
Level 1: 1 $\mu$, $|\eta| < 1.0$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$
         2 jet towers, $E_T > 3$ GeV
Level 2: 1 $\mu$, $p_T > 12$ GeV/c, $|\eta| < 1.0$
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: MU-JET-CENCAL
Exposure (pb$^{-1}$): 11.4
Level 1: 1 $\mu$, $|\eta| < 1.0$
         1 jet tower, $E_T > 5$ GeV, $|\eta| < 2.0$
         2 jet towers, $E_T > 3$ GeV
Level 2: 1 $\mu$, $p_T > 12$ GeV/c, $|\eta| < 1.0$, cal confirm
         1 jet ($\Delta R = 0.7$), $E_T > 15$ GeV, $|\eta| < 2.5$
Used by: $\mu$+jets, $\mu$+jets/$\mu$

Name: JET-3-L2MU
Exposure (pb$^{-1}$): 11.3
Level 1: 3 large tiles, $E_T > 15$, $|\eta| < 2.4$
         3 jet towers, $E_T > 5$ GeV, $|\eta| < 2.0$
         4 jet towers, $E_T > 3$ GeV
Level 2: 1 $\mu$, $p_T > 8$ GeV/c, $|\eta| < 1.7$, cal confirm
         3 jets ($\Delta R = 0.5$), $E_T > 15$ GeV, $|\eta| < 2.5$
         $\not\!\!E_T^{\rm cal} > 17$ GeV
Used by: $\mu$+jets, $\mu$+jets/$\mu$
These cuts remove QCD multijet background events
which appear to have a large $\not\!\!E_T$ due to a mismeasurement
of the muon momentum.

For the remaining, untagged, events, we require:

• $E_T^W \equiv |E_T^{\rm lep}| + |\not\!\!E_T| > 60$ GeV.

• $|\eta^W| < 2.0$.

For the purpose of these two cuts, we define $\eta^W$ by assuming
that the entire $\not\!\!E_T$ of the event is due to the
neutrino from the decay of the W boson. The longitudinal
component of the neutrino momentum $p_z^\nu$ is found
by using the W boson mass $M_W$ as a constraint. If the
transverse mass of the lepton and neutrino $M_T(l\nu)$ is less
than $M_W$, there are two real solutions; the one with the
smallest absolute value of $p_z^\nu$ is used. Monte Carlo studies
show that this is the correct solution about 80% of the
time. If $M_T(l\nu) > M_W$ there are no real solutions. In
this case, the $\not\!\!E_T$ is scaled so that $M_T(l\nu) = M_W$. This
scaled $\not\!\!E_T$ is also used for the $E_T^W$ cut (but not for the
previous cuts on $\not\!\!E_T$ alone).
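
The W mass constraint described above leads to a quadratic equation for $p_z^\nu$. The Python sketch below illustrates the procedure under simplifying assumptions that are ours, not necessarily those of the analysis code: the charged lepton and neutrino are treated as massless, the missing $E_T$ is rescaled in magnitude at fixed direction when $M_T(l\nu) > M_W$, and the value of $M_W$ is only an illustrative default. The function name is hypothetical.

```python
import math

M_W = 80.4  # GeV/c^2; illustrative value, not taken from the text


def neutrino_pz(lepton, met, m_w=M_W):
    """Solve the W mass constraint for the neutrino longitudinal momentum.

    lepton: (px, py, pz) of the charged lepton in GeV/c (treated as massless).
    met:    (px, py) of the missing E_T, taken as the neutrino p_T.
    Returns (pz_nu, met_used); met_used is rescaled when M_T > M_W.
    """
    lx, ly, lz = lepton
    nx, ny = met
    pt_l = math.hypot(lx, ly)
    pt_n = math.hypot(nx, ny)

    # Transverse mass squared of the lepton-neutrino system.
    mt2 = 2.0 * (pt_l * pt_n - (lx * nx + ly * ny))
    if mt2 > m_w ** 2:
        # No real solution: scale the missing E_T so that M_T = M_W.
        scale = m_w ** 2 / mt2
        nx, ny, pt_n = scale * nx, scale * ny, scale * pt_n

    a = 0.5 * m_w ** 2 + lx * nx + ly * ny
    e_l = math.sqrt(pt_l ** 2 + lz ** 2)
    disc = max(a * a - (pt_l * pt_n) ** 2, 0.0)  # guard against rounding
    root = e_l * math.sqrt(disc)

    sol_plus = (a * lz + root) / pt_l ** 2
    sol_minus = (a * lz - root) / pt_l ** 2
    # Keep the solution with the smallest |pz|, as in the text.
    pz = sol_plus if abs(sol_plus) <= abs(sol_minus) else sol_minus
    return pz, (nx, ny)
```

Once $p_z^\nu$ is known, the W boson four-momentum, and hence $\eta^W$, follows from adding the lepton and neutrino momenta.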

This cut on $E_T^W$ removes a portion of the QCD multijet
background. Figure 2 compares the $E_T^W$ distribution
for this background to that from Monte Carlo W + jets
events.

We show in Fig. 3 the distributions of $|\eta^W|$ for our data
and for the Monte Carlo prediction. The data are seen to
significantly exceed the prediction of the VECBOS Monte
Carlo (described in Sec. V) in the far forward region. The
amount of $t\bar{t}$ signal with $|\eta^W| > 2$ is only a few percent
($\approx 3\%$ for $m_t = 175$ GeV/$c^2$). In addition, a check of the
W boson transverse mass and $\not\!\!E_T$ distributions shows
that the QCD multijet background plays no unusually
prominent role at high $|\eta^W|$. We note that the VECBOS
Monte Carlo, while the best currently available, is only a
tree-level calculation of the W+jets process. Particularly
in the forward direction, one would expect higher order
corrections to play a larger role. To mitigate the effects of
this discrepancy, and to further reduce the background,
we require $|\eta^W| < 2$. Once this cut is made, the $\chi^2$
between the data and prediction is 12.2 for 7 d.o.f., giving
a 9% probability. ($\chi^2 = 2\sum_i [y_i - N_i + N_i \ln(N_i/y_i)]$,
where $N$ is the number of observed events and $y$ is the
total number expected from Monte Carlo. This form is
appropriate for low statistics [?].) The contribution of
this effect to the systematic error will be discussed in
Sec. VII G 2 (and is found to be negligible).
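
The low-statistics $\chi^2$ quoted above, $\chi^2 = 2\sum_i [y_i - N_i + N_i \ln(N_i/y_i)]$, can be evaluated with a few lines of Python. This is a minimal sketch with a hypothetical function name; bins with zero observed events contribute only the term $2y_i$, since $N \ln N \to 0$ as $N \to 0$.

```python
import math


def poisson_chi2(observed, expected):
    """Poisson likelihood chi-square for binned counts.

    observed: observed event counts N_i per bin.
    expected: expected counts y_i per bin (must be positive).
    """
    total = 0.0
    for n, y in zip(observed, expected):
        if y <= 0.0:
            raise ValueError("expected counts must be positive")
        total += y - n
        if n > 0:
            total += n * math.log(n / y)
    return 2.0 * total
```

The statistic vanishes when every bin matches its expectation and is otherwise positive; for large counts it approaches the usual Gaussian $\chi^2$.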

These event selection cuts are summarized in Table IV.
When applied to the approximately 125 pb$^{-1}$
of data from the 1992-1996 collider runs, 91 events are
selected [10], seven of which have a tag muon. This sample
will be referred to as the "precut" sample, and the
set of cuts as the "PR" cuts. One additional cut is made
to define the final sample. This is based on the $\chi^2$ of a
kinematic fit to the $t\bar{t}$ decay hypothesis ($\chi^2 < 10$), and is
described in Sec. VII. This final cut reduces the sample
to 77 candidate events, of which five are tagged.
FIG. 2. $E_T^W$ distribution for Monte Carlo W+jets events
(solid histogram) and for QCD multijet background data
(dashed histogram). All selection cuts are applied except for
the $E_T^W$ cut. The arrow shows the cut value. (The normalizations
are taken from the result of the LB fit to the data, as
described in Sec. VII E, with channels combined as described
in Sec. VII D. The models used to simulate the data are described
in Sec. V.)
FIG. 3. $|\eta^W|$ distribution for data (histogram), predicted
signal plus background (filled circles), and background alone
(open triangles). All selection cuts are applied except for the
$\eta^W$ cut. The arrow shows the cut value. (The normalizations
are as in Fig. 2.)
TABLE IV. Summary of event selection cuts.

                          $e$+jets                  $\mu$+jets                $e$+jets/$\mu$                                $\mu$+jets/$\mu$
Lepton                    $E_T^e > 20$ GeV          $p_T^\mu > 20$ GeV/c      $E_T^e > 20$ GeV                              $p_T^\mu > 20$ GeV/c
                          $|\eta^e| < 2$            $|\eta^\mu| < 1.7$        $|\eta^e| < 2$                                $|\eta^\mu| < 1.7$
$\not\!\!E_T$             > 20 GeV                  > 20 GeV                  > 20 GeV                                      > 20 GeV
$\not\!\!E_T^{\rm cal}$   > 25 GeV                  > 20 GeV                  -                                             > 20 GeV
Jets                      $\geq 4$ jets             $\geq 4$ jets             $\geq 4$ jets                                 $\geq 4$ jets
                          $E_T^{\rm jet} > 15$ GeV  $E_T^{\rm jet} > 15$ GeV  $E_T^{\rm jet} > 15$ GeV                      $E_T^{\rm jet} > 15$ GeV
                          $|\eta^{\rm jet}| < 2$    $|\eta^{\rm jet}| < 2$    $|\eta^{\rm jet}| < 2$                        $|\eta^{\rm jet}| < 2$
$\mu$ tag                 No tag                    No tag                    Tag required                                  Tag required
Other                     $E_T^W > 60$ GeV          $E_T^W > 60$ GeV          $\not\!\!E_T > 35$ GeV                        $\Delta\phi(\not\!\!E_T,\mu) < 170^\circ$
                          $|\eta^W| < 2.0$          $|\eta^W| < 2.0$          if $\Delta\phi(\not\!\!E_T,\mu) < 25^\circ$   $|\Delta\phi(\not\!\!E_T,\mu) - 90^\circ|/90^\circ < \not\!\!E_T/(45\ {\rm GeV})$
Events passing cuts       43                        41                        4                                             3
With $\chi^2 < 10$        35                        37                        2                                             3



IV. JET CORRECTIONS AND ENERGY SCALE 
ERROR 

To calibrate the energy scale so that data and Monte 
Carlo (MC) are on an equal footing, we apply a series 
of energy corrections to the measured objects. These 
corrections are carried out in three steps. The first of 
these corrections is done before events are selected and 
is used by most D0 analyses; the other two corrections 
are applied during the kinematic fit and are specific to 
the top quark mass analysis. 

A. Standard corrections

For the standard corrections, electromagnetic objects
are first scaled by a factor which was chosen to make
the invariant mass peak from dielectron events match
the Z boson mass as measured by the LEP experiments.
(This factor is determined separately for each of the three
cryostats of the calorimeter.) Next, jet energies are corrected
using

$$E({\rm corrected}) = \frac{E({\rm measured}) - O}{R\,(1 - S)}. \tag{4.1}$$

Here, $R$ is the calorimeter response; it is found using $E_T$
balance (as determined from the total $\not\!\!E_T$) in $\gamma$ + jets
events. This determination is done separately and symmetrically
for both data and Monte Carlo. $O$ is the offset
due to the underlying event, multiple interactions, and
noise from the natural radioactivity of the uranium absorber.
It is determined by comparing data in which a
hard interaction is required to data in which that requirement
is relaxed, and by comparing data taken at different
luminosities. The term $S$ is the fractional shower
leakage outside the jet cone in the calorimeter. It is determined
by using single particle showers measured in
the test beam to construct simulated showers from MC
jets; this leakage is approximately 3% for a 50 GeV jet
($\Delta R = 0.5$) in the central calorimeter. Further details
about these corrections may be found in Ref. [?].

TABLE V. Parameters for parton-level jet corrections.
$E({\rm corrected}) = (E - A)/B$.

                                  Light quark jets    Untagged b jets
$|\eta_{\rm det}|$ region         A (GeV)   B         A (GeV)   B
$0.0 < |\eta_{\rm det}| < 0.2$    0.322     0.933     -0.672    0.907
$0.2 < |\eta_{\rm det}| < 0.6$    0.635     0.930     -1.34     0.914
$0.6 < |\eta_{\rm det}| < 0.9$    1.86      0.883     0.002     0.868
$0.9 < |\eta_{\rm det}| < 1.3$    1.70      0.933     -0.548    0.904
$1.3 < |\eta_{\rm det}|$          4.50      0.882     2.46      0.859



B. Parton-level corrections 

The procedure of the previous section corrects for the 
portions of showers in the calorimeter which spread out- 
side of the jet cone, but not for any radiation outside of 
the cone. Thus, the corrected jet energies are systemat- 
ically lower than the corresponding parton-level energies 
(i.e., before QCD evolution or fragmentation in the MC). 
We make a correction to match the scale of the jet ener- 
gies to that of the unfragmented partons in the MC. 

To derive this correction, we use HERWIG [?] $t\bar{t}$ Monte
Carlo and match reconstructed jets to the partons from
top quark decay. Their energies are then plotted against
each other, as in Fig. 4. This relation is observed to be
nearly linear. We fit it separately for light quark jets
and for untagged b quark jets. The results are given in
Table V for different regions in $\eta_{\rm det}$ ($\eta_{\rm det}$ = "detector-$\eta$"
= the pseudorapidity corresponding to a particle coming
from the geometric center of the detector, rather than
from the interaction vertex). Separating the b quark
jets allows us to correct, on average, for the neutrinos
from b decays. This correction is observed not to depend
strongly on the MC top quark mass.
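
The parton-level correction $E({\rm corrected}) = (E - A)/B$ amounts to a table lookup in $|\eta_{\rm det}|$ followed by a linear rescaling. The Python sketch below uses the parameters of Table V; the function name and the treatment of values that fall exactly on a bin edge (assigned to the higher bin) are assumptions made for this example.

```python
# Table V parameters: upper edge of the |eta_det| bin, then (A [GeV], B)
# for light quark jets and for untagged b jets.
TABLE_V = [
    (0.2, (0.322, 0.933), (-0.672, 0.907)),
    (0.6, (0.635, 0.930), (-1.34, 0.914)),
    (0.9, (1.86, 0.883), (0.002, 0.868)),
    (1.3, (1.70, 0.933), (-0.548, 0.904)),
    (float("inf"), (4.50, 0.882), (2.46, 0.859)),
]


def parton_level_energy(e_jet, eta_det, is_untagged_b=False):
    """Apply E(corrected) = (E - A) / B with Table V parameters.

    e_jet:   jet energy in GeV after the standard corrections.
    eta_det: detector pseudorapidity of the jet.
    """
    abs_eta = abs(eta_det)
    for upper_edge, light, b_jet in TABLE_V:
        if abs_eta < upper_edge:
            a, b = b_jet if is_untagged_b else light
            return (e_jet - a) / b
    raise ValueError("eta_det is not a number")
```

For a 50 GeV light quark jet at $\eta_{\rm det} = 0.1$ this gives about 53 GeV, reflecting the energy radiated outside the jet cone.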

For tagged b quark jets, we have additional information 
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FIG. 4. The measured jet energies for quarks from W — > qq 
in tt MC are plotted against the corresponding parton ener- 
gies. Radiation outside of the jet cone causes the measured 
jet energy to be lower than the energy at the parton level. 
The dashed line is drawn along the diagonal, and the solid 
line is a linear fit to the points. This plot is based on herwig 
fragmentation with |?7^1 t | < 0.2. 




25 50 75 

Measured p 1 " (GeV/c) 



FIG. 5. Correlation between the measured momentum 
and the true momentum of the tag muon in Monte Carlo 
tt events. The curve is the result of an empirical fit, 
47.19[1 - exp(-0.03398 - 0.01593p" - 0.0005554(p M ) 2 )]. 



from the tag muon. However, the momentum spectrum of muons from $b$ quark decay in $t\bar{t}$ events is rather steeply falling; furthermore, the resolution of the muon system is more nearly Gaussian in the inverse momentum $1/p$ than in $p$. Thus, measurement errors will cause the measured momentum of a tag muon to be biased upwards. We correct for this bias using $t\bar{t}$ MC, as illustrated in Fig. 5. We then further scale the muon momentum to account for the unobserved neutrino, as shown in Fig. 6. The jet itself is corrected using the light quark corrections; the estimated leptonic energy is then added to this corrected jet energy.
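The tagged-jet procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the two functions use the empirical curves quoted in the captions of Figs. 5 and 6, the Fig. 6 curve is read as a multiplicative factor applied to the bias-corrected muon momentum (the text says the momentum is "scaled"), the muon mass is neglected, and all function names are hypothetical.

```python
import math

def corrected_muon_momentum(p_measured):
    """Bias-corrected tag-muon momentum (GeV/c), using the Fig. 5 empirical curve."""
    exponent = -0.03398 - 0.01593 * p_measured - 0.0005554 * p_measured ** 2
    return 47.19 * (1.0 - math.exp(exponent))

def leptonic_energy_scale(p_muon):
    """Fig. 6 empirical curve, interpreted here (assumption) as a scale factor
    that converts the muon momentum into the total leptonic energy (muon + neutrino)."""
    return (1.313
            + math.exp(3.101 - 0.6528 * p_muon)
            + math.exp(0.4622 - 0.06514 * p_muon))

def tagged_b_jet_energy(jet_energy_light_corrected, p_measured):
    """Add the estimated leptonic energy to a jet already corrected
    with the light-quark correction."""
    p_corr = corrected_muon_momentum(p_measured)
    return jet_energy_light_corrected + p_corr * leptonic_energy_scale(p_corr)
```

With these curves a measured momentum of 50 GeV/c is pulled down to roughly 42 GeV/c, in line with the upward bias described in the text.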

C. $\eta$-dependent adjustment and energy scale error

For the final corrections, we study the response of the detector to $\gamma$ + 1 jet events, using both data and Monte Carlo. We select events containing exactly one photon with $E_T^\gamma > 20$ GeV, $|\eta^\gamma_{\rm det}| < 1.0$ or $1.6 < |\eta^\gamma_{\rm det}| < 2.5$, and exactly one reconstructed jet of any energy (excluding the photon). We require that the jet satisfy $E_T > 15$ GeV, $|\eta| < 2$, and $|\pi - \Delta\phi({\rm jet}, \gamma)| < 0.2$ rad. We reject events with Main Ring activity and those which are likely to be multiple interactions. To reject W boson decays, we further require that $\not\!E_T/E_T^\gamma < 1.2$ if $E_T^\gamma < 25$ GeV, or $\not\!E_T/E_T^\gamma < 0.65$ otherwise. With this selection, we compute




1 - 
[Plot axis: true $p^\mu$ (GeV/c)]

FIG. 6. Correlation between the tag muon momentum and the total leptonic energy from $b$ quark decay in MC $t\bar{t}$ events. The curve is the result of an empirical fit, $1.313 + \exp(3.101 - 0.6528\,p^\mu) + \exp(0.4622 - 0.06514\,p^\mu)$.

FIG. 7. The energy scale deviation $\Delta S$ as a function of $\eta^{\rm jet}_{\rm det}$ for (a) data and (b) Monte Carlo. The curves are empirical multi-Gaussian fits to the points.

FIG. 8. The relative energy scale difference between data and MC as a function of photon $E_T$ after all jet corrections are applied. The curves are the error band $\pm(2.5\% + 0.5$ GeV$)$.



and plot it as a function of $\eta^{\rm jet}_{\rm det}$. The result is shown in Fig. 7. This reveals detector inhomogeneities in the transition region between the central and end calorimeters. The curve from Monte Carlo is also seen to have a somewhat different shape than that from data. To remove these effects, we smooth the $\Delta S$ distributions by fitting them to the sum of several Gaussians, and scale each jet by $1/(1 + \Delta S(\eta^{\rm jet}_{\rm det}))$. This is done separately for data and for Monte Carlo.
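As a minimal sketch of this step, assume the smoothed $\Delta S$ curve is stored as a list of (amplitude, mean, width) triples, one per Gaussian. The numbers in `EXAMPLE_GAUSSIANS` are hypothetical placeholders, not the fitted values behind Fig. 7, and separate parameter sets would be needed for data and for Monte Carlo.

```python
import math

# Hypothetical (amplitude, mean, width) triples; the real ones come from the fits in Fig. 7.
EXAMPLE_GAUSSIANS = [(0.04, 1.2, 0.15), (-0.02, 0.0, 0.5)]

def delta_s(eta_det, gaussians=EXAMPLE_GAUSSIANS):
    """Smoothed energy scale deviation: a sum of Gaussians in detector eta."""
    return sum(a * math.exp(-0.5 * ((eta_det - mu) / sigma) ** 2)
               for a, mu, sigma in gaussians)

def eta_corrected_energy(jet_energy, eta_det, gaussians=EXAMPLE_GAUSSIANS):
    """Scale a jet by 1 / (1 + Delta S(eta_det))."""
    return jet_energy / (1.0 + delta_s(eta_det, gaussians))
```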

To estimate the uncertainty in the relative scale between data and Monte Carlo after all corrections, we derive $\Delta S$ as a function of $E_T^\gamma$ (averaging over $\eta^{\rm jet}_{\rm det}$) for both data and MC after all corrections have been applied. The difference of the two is plotted in Fig. 8, along with a band of $\pm(2.5\% + 0.5$ GeV$)$, which we use as our estimate of the systematic error of the jet energy calibration. (It is the relative data-MC difference that is relevant, rather than the absolute error, since the final mass is extracted by comparing the data to MC generated with known top quark masses.)
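For illustration, the error band can be written as a per-jet shift. This is a sketch only: the function names are hypothetical, and whether the shift is applied to the jet energy or to its $E_T$ is left to the caller.

```python
def jet_scale_uncertainty(value_gev):
    """Size of the +-(2.5% + 0.5 GeV) band for a jet energy (or E_T) in GeV."""
    return 0.025 * value_gev + 0.5

def shifted_jet(value_gev, direction):
    """Shift a jet up (direction=+1) or down (direction=-1) by the band."""
    return value_gev + direction * jet_scale_uncertainty(value_gev)
```

For a 40 GeV jet the band is 1.5 GeV, so the shifted values are 41.5 GeV and 38.5 GeV.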

A cross-check of these corrections is provided by $(Z \to ee)$ + jets events. As shown in Fig. 9, the corrected jets satisfactorily balance the Z boson. We also show in Fig. 10 the $W \to q\bar{q}$ and $t \to bq\bar{q}$ masses from $t\bar{t}$ MC before and after the final two corrections. It is seen that the proper masses are recovered.

The accuracy of these corrections depends on how well the Monte Carlo models jet widths. Studies of jets in D0 data show that HERWIG models the transverse energy distribution within jets to within 5-10% |Q. Note, however, that since the determination of the response is done separately for data and for Monte Carlo, any disagreements would, to first order, be removed from the energy scale determination. There can still be second-order effects: for example, if jets in HERWIG were slightly too narrow, and if two jets were to overlap slightly, then the perturbation to the apparent jet energies due to that overlap would be slightly underestimated in the Monte Carlo. For this situation, we calculate that the fraction of the energy of a jet between $R = 0.5$ and $R = 1.0$ of the jet axis which leaks into the nearest jet is about 10%. We further find that this region in $R$ contains about 10% of the total energy of a HERWIG jet. Thus, the leakage of energy from a jet to a neighbor is on the order of 1%. If the fraction of the jet energy outside of $R = 0.5$ is substantially larger in data than in HERWIG, e.g., 20%, a 1% miscalibration would result. This is well within the errors we assign for moderate $E_T$ jets.
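The arithmetic behind this estimate can be made explicit. The symbols $f_{\rm out}$ (fraction of a jet's energy between $R = 0.5$ and $R = 1.0$) and $f_{\rm leak}$ (fraction of that energy which lands in the nearest jet) are introduced here only for illustration; the numbers are those quoted in the text.

```latex
\begin{align*}
\text{HERWIG:}\quad & f_{\rm out}\, f_{\rm leak} \approx 0.10 \times 0.10 = 0.01,\\
\text{data, if } f_{\rm out} = 0.20:\quad & f_{\rm out}\, f_{\rm leak} \approx 0.20 \times 0.10 = 0.02,\\
\text{miscalibration:}\quad & 0.02 - 0.01 = 0.01 \;(1\%).
\end{align*}
```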

V. EVENT SIMULATION 

Monte Carlo simulation is used to model the final 
states expected from top quark decays and their principal 
physics backgrounds. Although the overall background 
normalization is estimated using the observed data, the 
simulation is essential to determine the expected shapes 
of kinematic distributions. 
A. Signal events



FIG. 9. Transverse energy balance for $(Z \to ee)$ + jets events. The vector $\vec{p}_T^{\,Z} + \sum_{\rm jets} \vec{E}_T$ is projected onto the angle bisector of the two electrons. All jet corrections are applied. The curve is a Gaussian fit to the histogram (fit results displayed on the plot: mean $-0.138$, width 6.20).



Our primary model for $t\bar{t}$ production is the HERWIG generator, version 5.7, with CTEQ3M [15] parton distribution functions. HERWIG models $t\bar{t}$ production starting with the elementary hard process, choosing the parton momenta according to matrix element calculations. Initial and final state gluon emission is modeled using leading log QCD evolution. Each top quark is then decayed to a W boson and a $b$ quark, and final state partons are hadronized into jets. Underlying spectator interactions are also included in the model.

For this analysis, samples are generated with top quark masses between 110 and 230 GeV/$c^2$. To increase the efficiency in the processing of lepton plus jets events, one of the W bosons is forced to decay to one of the three lepton families. Events with no final state electrons or muons are vetoed, and half of the events in which both W bosons decayed leptonically are discarded in order to preserve the proper branching ratios. The generated events are run through the D0GEANT detector simulation [17,18] and the D0 event reconstruction program.
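The factor of one half follows from a short counting argument. Here $p$ denotes the leptonic branching fraction of a single W boson; the symbol is introduced for illustration only and its value does not matter for the argument.

```latex
\begin{align*}
\text{Unforced decays:}\quad
  & P(\ell\ell) = p^2, \qquad P(\ell + \text{jets}) = 2p(1-p), \qquad
    \frac{P(\ell\ell)}{P(\ell + \text{jets})} = \frac{p}{2(1-p)},\\
\text{One } W \text{ forced leptonic:}\quad
  & P(\ell\ell) = p, \qquad P(\ell + \text{jets}) = 1-p, \qquad
    \frac{P(\ell\ell)}{P(\ell + \text{jets})} = \frac{p}{1-p}.
\end{align*}
```

The forced sample therefore contains twice the natural proportion of dilepton events relative to lepton plus jets events, and discarding half of the dilepton events restores the ratio.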

Additional samples are made using the ISAJET [19] generator to allow for cross-checks.



B. W+jets background 




[Plot axis: mass (GeV/$c^2$)]

FIG. 10. Masses of $W \to q\bar{q}$ and $t \to bq\bar{q}$ in $t\bar{t}$ MC with $m_t = 175$ GeV/$c^2$, both (a), (b) with standard corrections only and (c), (d) with all jet corrections. The arrows locate the input W boson and top quark masses.



The background due to the production of a W boson along with multiple jets is modeled using the VECBOS [20] event generator. VECBOS supplies final state partons as a result of a leading order calculation which incorporates the exact tree level matrix elements for W and Z boson production with up to four additional partons. To include the effects of additional radiation and the underlying processes, and to model the hadronization of final state partons, the output of VECBOS is passed through HERWIG's QCD evolution and fragmentation stages. Since HERWIG requires information about the color labels of its input partons, it and VECBOS were modified to assign color and flavor to the generated partons. Flavors are assigned probabilistically by keeping track of the relative weights of each diagram contributing to the process. Color labels are simply assigned randomly. To estimate systematic errors, we also generate samples which use ISAJET instead of HERWIG to fragment the VECBOS partons. We test the reliability of the HERWIG and ISAJET simulations of higher order processes by comparing W + four jet events generated using the VECBOS W + four jet process to those generated using the W + three jet process.

Events are generated using the same parton distribution functions assumed for the signal sample. The dynamical scale of the process is set to be the average jet $p_T$. Systematic uncertainties arising from this choice are estimated by changing the scale to the mass of the W boson in a second sample of events. The background samples are processed through the detector simulation, reconstruction, and event selection in the same manner as for the signal samples.



C. QCD multijet background 

The non-W QCD multijet background is estimated, both for the electron and the muon channels, using background-enriched data samples. In the electron channel, the sample consists of events containing highly electromagnetic jets failing the electron identification cuts. In the muon channel, events are selected containing a muon which fails the isolation requirement, but which otherwise passes the muon identification cuts.



VI. TOP DISCRIMINANTS 

The key feature that distinguishes top quark events from the W+jets and QCD multijet backgrounds is the fitted mass $m_{\rm fit}$ obtained from kinematic fits of the events to the top quark decay hypothesis. Since the top quark is heavy, the fitted mass tends to be larger for top quark events than for the backgrounds. Therefore, if both the signal to background ratio and the signal are large enough, we should see a clear signal peak in the $m_{\rm fit}$ distribution. However, there is a caveat: this is true only if the cuts to enhance the signal to noise ratio do not significantly distort the fitted mass distributions. Unfortunately, powerful selection variables such as $H_T \equiv \sum E_T^{\rm jet}$ tend to be highly correlated with the fitted mass. Cuts on them thus introduce severe distortions in $m_{\rm fit}$ which reduce the differences between the distributions for $t\bar{t}$ signal and background, and between the distributions for $t\bar{t}$ signal at different top quark masses, thus impairing the mass measurement.

This distortion of the $m_{\rm fit}$ distribution can be avoided by using variables which are only weakly correlated with the fitted mass. The challenge is to find variables that also provide a useful measure of discrimination between signal and background. After an extensive search of variables that exploit the expected qualitative differences between the kinematics of top quark events and the backgrounds, we have succeeded in finding four variables $x_1$-$x_4$ with the desired properties.

This success, however, comes at a price: the discrimination afforded by these variables tends to be weaker than that provided by variables, like $H_T$, that are mass dependent. But by treating these variables collectively, rather than applying a cut on each separately, we can compensate for their weaker discrimination. It is most effective to combine the variables into a multivariate discriminant $\mathcal{D}(\mathbf{x})$ with the general form

$$\mathcal{D}(\mathbf{x}) = \frac{f_s(\mathbf{x})}{f_s(\mathbf{x}) + f_b(\mathbf{x})}, \tag{6.1}$$



where $\mathbf{x}$ denotes the 4-tuple of mass-insensitive variables and $f_s(\mathbf{x})$ and $f_b(\mathbf{x})$ are functions that pertain to the signal and background, respectively. We choose the functions $f_s$ and $f_b$ so that $\mathcal{D}(\mathbf{x})$ is concentrated near zero for the background and near unity for the signal.

In the following sections we describe the variables $x_1$-$x_4$ and the two complementary forms we have used for the functions $f_s(\mathbf{x})$ and $f_b(\mathbf{x})$.



A. Variables 

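The four variables $\{x_1, x_2, x_3, x_4\} = \mathbf{x}$ are defined as follows:

$$\begin{aligned}
x_1 &\equiv \not\!E_T \\
x_2 &\equiv A \\
x_3 &\equiv H_{T2}/H_z \\
x_4 &\equiv \Delta R_{jj}^{\rm min}\, E_T^{\rm min} / E_T^W.
\end{aligned} \tag{6.2}$$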



Our use of the variable $x_1$ is motivated by the fact that top quark events have substantial missing transverse energy, due to the neutrino from the leptonically-decaying W boson, while QCD multijet background events do not. Variable $x_2$ is the aplanarity $A$, which is defined in terms of the normalized momentum tensor of the jets and the W boson:

$$M_{ab} = \sum_i p_{ia}\, p_{ib} \Big/ \sum_i p_i^2, \tag{6.3}$$



where $\vec{p}_i$ is the three-momentum of the $i$th object in the laboratory frame, and $a$, $b$ run over $x$, $y$, and $z$. (For this and the remaining two variables, we use all jets satisfying $E_T > 15$ GeV and $|\eta^{\rm jet}| < 2$.) The W boson momentum is defined by the sum of the lepton and neutrino momentum vectors, where the $z$-component of the neutrino momentum is determined as described in Sec. [II C. If the three eigenvalues of $M_{ab}$ are denoted $Q_j$ such that

$$Q_1 \le Q_2 \le Q_3, \tag{6.4}$$

then

$$A = \tfrac{3}{2}\, Q_1. \tag{6.5}$$



This variable is a measure of the degree to which the final state particles lie out of a plane. In W + jets events, a high $p_T$ W boson recoils against a hadronic system that is typically dominated by a single high $p_T$ jet. In QCD multijet events, two jets, perturbed by gluon radiation, recoil against each other. The signal, by contrast, has a momentum flow that is more spherical. It therefore has a larger aplanarity than do the backgrounds, which have more longitudinal topologies. (The aplanarity for top quark events is expected to decrease with increasing $m_t$ due to the W boson decay products becoming more
FIG. 11. Plot of $H_{T2}$ for the 77-event candidate sample, compared with the expectation for $m_t = 175$ GeV/$c^2$ signal plus background (filled circles), signal alone (open squares), and background alone (open triangles). (The normalizations are as in Fig. H.)



collimated. This effect, however, is very small for $m_t < 200$ GeV/$c^2$.)
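The aplanarity definition of Eqs. (6.3)-(6.5) is short enough to sketch directly. The function below is an illustration using NumPy; the input is assumed to be a list of three-momenta (jets plus the reconstructed W boson), and the function name is hypothetical.

```python
import numpy as np

def aplanarity(momenta):
    """Aplanarity A = (3/2) * smallest eigenvalue of the normalized momentum tensor."""
    p = np.asarray(momenta, dtype=float)       # shape (n_objects, 3)
    tensor = p.T @ p / np.sum(p * p)           # M_ab = sum_i p_ia p_ib / sum_i |p_i|^2
    eigenvalues = np.linalg.eigvalsh(tensor)   # real, sorted in ascending order
    return 1.5 * eigenvalues[0]
```

A set of momenta confined to a plane gives $A = 0$, while a perfectly isotropic configuration gives the maximum value $A = 1/2$.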

The variable $H_T$, as noted above, is a powerful discriminant between signal and background. But, since both the signal and background tend to have at least one high $p_T$ jet, we can improve the discrimination somewhat by removing the highest $p_T$ jet from $H_T$, yielding $H_{T2}$. A plot of this variable is shown in Fig. 11. This variable, however, is correlated with the fitted mass. Therefore, we divide by another mass-sensitive variable, namely $H_z$ (equal to the sum of $|p_z|$ of the lepton, neutrino, and the jets), in order to reduce that correlation. The longitudinal component of the neutrino momentum is found by the same method used to define $\eta^W$. We thus arrive at variable $x_3$, which measures the centrality of the events, top quark events being more central than the backgrounds.
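A minimal sketch of $x_3$, assuming $H_T$ is the scalar sum of the jet $E_T$ values and that the neutrino $p_z$ has already been determined elsewhere. All function and argument names are hypothetical.

```python
def ht2(jet_ets):
    """H_T2: scalar sum of jet E_T with the highest-E_T jet removed."""
    ordered = sorted(jet_ets, reverse=True)
    return sum(ordered[1:])

def h_z(lepton_pz, neutrino_pz, jet_pzs):
    """H_z: sum of |p_z| over the lepton, the neutrino, and the jets."""
    return abs(lepton_pz) + abs(neutrino_pz) + sum(abs(pz) for pz in jet_pzs)

def x3(jet_ets, lepton_pz, neutrino_pz, jet_pzs):
    """Centrality-like variable x3 = H_T2 / H_z."""
    return ht2(jet_ets) / h_z(lepton_pz, neutrino_pz, jet_pzs)
```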

The last variable, $x_4$, is motivated by the observation that the four highest $E_T$ jets in top quark events have a different origin than the jets in W+jets and QCD multijet events. For $t\bar{t}$ events, the four highest $E_T$ jets are mostly from the decay of the $t\bar{t}$ system. These jets tend to be widely separated in $\eta$-$\phi$ space. For the backgrounds, usually at least one jet is the result of gluon radiation and is therefore somewhat closer to another jet, on average, than the jets in $t\bar{t}$ events. Therefore, we are led to consider the six possible pairs of the four highest $E_T$ jets and take the pair with the minimum separation $\Delta R_{jj}^{\rm min}$ in $\eta$-$\phi$ space. We then multiply this minimum separation by the $E_T$ of the lesser jet of the pair, thus constructing
FIG. 12. The variables $x_1 \ldots x_4$ used as input to the top quark discriminants, for W + 3 jet control samples. Histograms are data, and the circles are the expected signal + background mixture.



a variable akin to the $p_T$ of one jet relative to another. Again, to reduce the correlation with mass, we divide by another mass-sensitive variable, $E_T^W \equiv |E_T^{\rm lep}| + |\not\!E_T|$.
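The construction of $x_4$ can be sketched as follows. This is an illustration, not the analysis code: jets are assumed to be dictionaries with hypothetical keys `et`, `eta`, and `phi`, and the denominator $E_T^W$ is assumed to be the scalar sum of the lepton $E_T$ and the missing $E_T$.

```python
import itertools
import math

def delta_r(jet_a, jet_b):
    """Separation in eta-phi space, with the phi difference wrapped into [-pi, pi]."""
    d_eta = jet_a["eta"] - jet_b["eta"]
    d_phi = math.remainder(jet_a["phi"] - jet_b["phi"], 2.0 * math.pi)
    return math.hypot(d_eta, d_phi)

def x4(jets, lepton_et, missing_et):
    """x4 = (min Delta R among the 4 leading jets) * (E_T of the lesser jet) / E_T^W."""
    leading = sorted(jets, key=lambda j: j["et"], reverse=True)[:4]
    # Six pairs of the four leading jets; keep the closest pair.
    closest = min(itertools.combinations(leading, 2), key=lambda pair: delta_r(*pair))
    lesser_et = min(closest[0]["et"], closest[1]["et"])
    return delta_r(*closest) * lesser_et / (lepton_et + missing_et)
```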

We have verified that the variables $x_1$-$x_4$ are well modeled by our Monte Carlo calculations. Figure 12 shows the observed distributions of these variables compared with the Monte Carlo predictions for a sample of W + 3 jet events, which is dominated by background. In addition, Fig. 13 shows the distributions of these variables for the 77-event candidate sample, compared with Monte Carlo expectations. The Monte Carlo models the data well. We thus use these variables for the multivariate discriminants we now describe.



B. Likelihood discriminant 

The correlations among the variables $x_1$-$x_4$ are small. Although we may not conclude that the variables are, as a consequence, independent, experience shows that it is frequently true that weakly correlated variables are also nearly independent. We assume this to be true for $x_1$-$x_4$ and write the functions $f_s$ and $f_b$ as

$$f_s(\mathbf{x}) = \prod_{i=1}^{4} s_i^{w_i}(x_i), \qquad f_b(\mathbf{x}) = \prod_{i=1}^{4} b_i^{w_i}(x_i), \tag{6.6}$$
FIG. 13. The variables $x_1 \ldots x_4$ used as input to the top quark discriminants, for the 77-event candidate sample (histogram), $t\bar{t}$ signal plus background for $m_t = 175$ GeV/$c^2$ (filled circles), signal alone (open squares), and background alone (open triangles). (The normalizations are as in Fig. |^.)



where $s_i(x_i)$ and $b_i(x_i)$ are the normalized distributions of variable $x_i$ for signal and background, respectively. These forms reduce to the usual likelihood function for strictly independent variables when the weights $w_i = 1$. With the weights adjusted slightly away from unity, we can nullify the correlation between $m_{\rm fit}$ and the discriminant $\mathcal{D}_{\rm LB}(\mathbf{x})$ formed from Eqs. (6.1) and (6.6), while maintaining maximal discrimination between high-mass ($> 170$ GeV/$c^2$) top events and the background. The subscript "LB" (= "low bias") denotes the fact that cuts on $\mathcal{D}_{\rm LB}$ introduce negligible bias (that is, distortion) in the $m_{\rm fit}$ distributions.

We have found it useful to have a parameterized form for the discriminant $\mathcal{D}_{\rm LB}$. Rather than directly parameterizing the functions $f_s$ and $f_b$, it is simpler to parameterize the ratio $\mathcal{L} = f_s/f_b$ by using polynomial fits to the four functions $\mathcal{L}_i = s_i(x_i)/b_i(x_i)$ and then computing $\mathcal{L} = \exp\left(\sum_i w_i \ln \mathcal{L}_i\right)$ ||. We then find $\mathcal{D}_{\rm LB} = \mathcal{L}/(1 + \mathcal{L})$.
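The last step can be written out as a short function. This is a sketch under the assumption that the four ratios $\mathcal{L}_i = s_i(x_i)/b_i(x_i)$ have already been evaluated (in the analysis they come from polynomial fits, which are not reproduced here); the function name is hypothetical.

```python
import math

def low_bias_discriminant(ratios, weights):
    """D_LB = L / (1 + L), with L = exp(sum_i w_i * ln L_i)."""
    log_l = sum(w * math.log(r) for r, w in zip(ratios, weights))
    likelihood_ratio = math.exp(log_l)
    return likelihood_ratio / (1.0 + likelihood_ratio)
```

With all weights equal to one this reduces to the ordinary product of ratios; for example, ratios of (2, 1, 1, 1) give $\mathcal{D}_{\rm LB} = 2/3$. Since $\mathcal{L} = f_s/f_b$, the expression $\mathcal{L}/(1+\mathcal{L})$ is the same quantity as $f_s/(f_s + f_b)$ in Eq. (6.1).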

We also make use of cuts based on $\mathcal{D}_{\rm LB}$ and $H_{T2}$. All tagged events pass this "LB selection"; for untagged events, we require:

• $\mathcal{D}_{\rm LB} > 0.43$ and

• $H_{T2} > 90$ GeV.

This selection is used in several places to separate the sample into signal-rich and background-rich portions. The cut $\mathcal{D}_{\rm LB} > 0.43$ was chosen to minimize the error on the top quark mass when analyzing Monte Carlo samples. The $H_{T2}$ cut removes very little signal for the top
FIG. 14. The discriminant variables (a) $\mathcal{D}_{\rm LB}$ and (b) $\mathcal{D}_{\rm NN}$ plotted for the $m_t = 175$ GeV/$c^2$ $t\bar{t}$ (hatched) sample and the simulated background (unhatched). All histograms are normalized to unity.



quark masses of interest (see Fig. 11), but provides an easy way of further reducing the background.
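Expressed as code, the LB selection is a simple predicate. The thresholds are those quoted in the text; the function and argument names are hypothetical.

```python
def passes_lb_selection(is_tagged, d_lb, ht2_gev):
    """LB selection: tagged events always pass; untagged events need
    D_LB > 0.43 and H_T2 > 90 GeV."""
    if is_tagged:
        return True
    return d_lb > 0.43 and ht2_gev > 90.0
```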



C. Neural network discriminant 

The variables $x_1$-$x_4$ were chosen to have minimal correlations with the fitted mass. We therefore consider a second, complementary, discriminant in which no attempt is made to nullify the correlation between the discriminant and the fitted mass. We do attempt, however, to account for the small correlations that exist among the variables $x_1$-$x_4$. This discriminant, denoted by $\mathcal{D}_{\rm NN}$, is calculated with a neural network (NN) having four input nodes, three hidden nodes, and a single output node, whose value is $\mathcal{D}_{\rm NN}$. The network is trained using the back-propagation algorithm provided in the program JETNET V3.0 p3| using the default training parameters. We use HERWIG $t\bar{t}$ Monte Carlo with $m_t = 170$ GeV/$c^2$ as the signal, and VECBOS W + jets events as the background (equal numbers of each). During training, the target outputs are set to unity for the signal and zero for the background. Under these conditions, the network output approximates the ratio $s(\mathbf{x})/[s(\mathbf{x}) + b(\mathbf{x})]$, where $s(\mathbf{x})$ is the normalized density for the signal and $b(\mathbf{x})$ is the normalized density for the background. Since the correlations among $x_1 \ldots x_4$ are small, as are the correlations with the fitted mass, we should anticipate that the discriminants $\mathcal{D}_{\rm LB}$ and $\mathcal{D}_{\rm NN}$ will provide comparable levels of signal to background discrimination. That this is true is evident, qualitatively, from Fig. 14, which compares the distributions of $\mathcal{D}_{\rm LB}$ and $\mathcal{D}_{\rm NN}$ for top quark events and for the mixture of W+jets and QCD multijet events appropriate for the precuts discussed earlier. The dependence of the discriminants on the top quark mass is indeed small, as shown in Fig. 15. In Fig. 16 we compare the distributions of the two discriminants obtained from the candidate sample to those predicted from Monte Carlo; the agreement is quite good.
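The statement that the network output approximates $s/(s+b)$ follows from a standard argument, sketched here under the assumption that training minimizes a mean squared error with equal numbers of signal and background events. The symbol $o(\mathbf{x})$ for the network output is introduced only for this illustration.

```latex
\begin{align*}
E[o] &= \tfrac{1}{2}\int \Big[\, s(\mathbf{x})\,\big(o(\mathbf{x}) - 1\big)^2
        + b(\mathbf{x})\, o(\mathbf{x})^2 \Big]\, d\mathbf{x},\\
\frac{\delta E}{\delta o(\mathbf{x})} &= s(\mathbf{x})\,\big(o(\mathbf{x}) - 1\big)
        + b(\mathbf{x})\, o(\mathbf{x}) = 0
\quad\Longrightarrow\quad
o(\mathbf{x}) = \frac{s(\mathbf{x})}{s(\mathbf{x}) + b(\mathbf{x})}.
\end{align*}
```

A network with finite capacity and finite training time only approaches this minimum, which is why the text says "approximates".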

Analogous to the LB selection, we will also make use of a cut on $\mathcal{D}_{\rm NN}$. This "NN selection" is defined by $\mathcal{D}_{\rm NN} > 0.6$. This cut value yields roughly the same discrimination as the LB selection.

FIG. 15. The discriminant variables (a) $\mathcal{D}_{\rm LB}$ and (b) $\mathcal{D}_{\rm NN}$ for $t\bar{t}$ Monte Carlo with $m_t = 150$ GeV/$c^2$ (dashed lines), $m_t = 175$ GeV/$c^2$ (solid lines), and $m_t = 200$ GeV/$c^2$ (dotted lines). All histograms are normalized to unity.



VII. VARIABLE-MASS FIT 

A. Introduction 

The method used can be summarized as follows. For each event in the precut sample, we perform a constrained kinematic fit to the hypothesis $t\bar{t} \to \ell$ + jets to arrive at a "fitted mass" $m_{\rm fit}$. Events which fit poorly are discarded. For each event, we also compute a top quark discriminant $\mathcal{D}$ (either $\mathcal{D}_{\rm LB}$ or $\mathcal{D}_{\rm NN}$). The events are then entered into a two-dimensional histogram in the $(\mathcal{D}, m_{\rm fit})$ plane. Similar histograms are also constructed for a sample of background events and for signal Monte Carlo at various top quark masses. For each of these MC masses, we fit a sum of the signal and background histograms to the data histogram. This fit yields a background fraction and a corresponding likelihood value. These likelihood values are then plotted as a function of the top quark mass, and the final result extracted by fitting a quadratic function to their logarithms.
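The final step, a quadratic fit to the log-likelihood values, can be sketched with NumPy. This is an illustration only: it fits all supplied points (the analysis may restrict the fit range), it works with $-\ln L$, and it takes the statistical error as the half-width at which $-\ln L$ rises by 0.5 above its minimum. The function name is hypothetical.

```python
import numpy as np

def mass_from_likelihoods(masses, neg_log_likelihoods):
    """Fit -ln L = a*m^2 + b*m + c and return (best-fit mass, statistical error)."""
    a, b, _ = np.polyfit(masses, neg_log_likelihoods, 2)  # highest power first
    m_best = -b / (2.0 * a)             # vertex of the parabola
    sigma = 1.0 / np.sqrt(2.0 * a)      # -ln L rises by 0.5 at m_best +- sigma
    return m_best, sigma
```

Feeding in an exact parabola centered at 173 GeV/$c^2$ with curvature corresponding to a 5.6 GeV/$c^2$ error returns those two numbers.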



B. Kinematic fit 
FIG. 16. The discriminant variables (a) $\mathcal{D}_{\rm LB}$ and (b) $\mathcal{D}_{\rm NN}$ for the 77-event candidate sample (histogram), $t\bar{t}$ signal plus background (filled circles), and background alone (open triangles). The binnings were chosen such that the predicted signal plus background distribution would be approximately flat.



The goal of the kinematic fit is to constrain a measured event to the hypothesis

$$p\bar{p} \to t\bar{t} + X \to (W^+ b)(W^- \bar{b}) + X \to (\ell\nu b)(q\bar{q}\bar{b}) + X \tag{7.1}$$

(or the charge conjugate) and thus arrive at an estimate $m_{\rm fit}$ of the top quark mass. There is a complication, however, in that when reconstructing the event, we do not know a priori which observed jet corresponds to which parton. In fact, due to QCD radiative effects, jet merging and splitting during reconstruction, and jet reconstruction inefficiencies, the observed jets may have no one-to-one correspondence with the unfragmented partons from the $t\bar{t}$ decay. Nevertheless, the fitted mass $m_{\rm fit}$ constructed from the observed jets is correlated with the true top quark mass and can thus be used for a measurement; however, $m_{\rm fit}$ should not be thought of as "the top quark mass" for a particular event.

The inputs to the fit are the kinematic parameters of the lepton, the jets, and the missing transverse energy vector $\vec{\not\!E}_T$. Only the four jets with the largest $E_T$ within $|\eta| < 2.5$ are used in the fit (any additional jets are assumed to be due to initial state radiation). We parameterize electrons and jets in terms of energy $E$, azimuthal angle $\phi$, and pseudorapidity $\eta$. For muons, we parameterize the momentum in terms of $k = 1/p$, since the resolution is more nearly Gaussian in that variable. The muon direction is also represented as $(\phi, \eta)$. Leptons and light quarks are fixed to zero mass; $b$ quarks are fixed to a mass of 5 GeV/$c^2$. The transverse momentum of the neutrino is taken to be $\vec{\not\!E}_T$. However, we do not use $\vec{\not\!E}_T$ directly in the fit, as it is correlated with all the other objects in the event. Instead, we use the $x$ and $y$ components of

$$\vec{k}_T = \vec{\not\!E}_T + \vec{E}_T^{\,\rm lep} + \sum_{4\ {\rm jets}} \vec{E}_T^{\,\rm jet}. \tag{7.2}$$

This can be thought of as the transverse momentum of the $t\bar{t}$ pair. Note that this is not necessarily a small quantity if the event has more than four jets. One additional variable is needed to uniquely define the event kinematics: we take that to be the $z$-component of the neutrino momentum $p_z^\nu$. This variable is not measured, but is determined by the fit. This gives a total of 18 variables.

With this parameterization, there are three kinematic constraints which can be applied:

$$\begin{aligned}
m(t \to \ell\nu b) &= m(t \to q\bar{q}b) \\
m(\ell\nu) &= M_W \\
m(q\bar{q}) &= M_W.
\end{aligned} \tag{7.3}$$

Three constraints and one unmeasured variable allow for a 2C fit.
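The counting behind the 18 variables and the 2C fit can be laid out explicitly. The breakdown follows the parameterization described above (three parameters each for the lepton and for the four jets); the labels $N_{\rm var}$ and $N_{\rm C}$ are introduced only for this illustration.

```latex
\begin{align*}
N_{\rm var} &= \underbrace{3}_{\text{lepton}}
             + \underbrace{4 \times 3}_{\text{four jets}}
             + \underbrace{2}_{k_{Tx},\,k_{Ty}}
             + \underbrace{1}_{p_z^\nu\ \text{(unmeasured)}} = 18,\\
N_{\rm C} &= \underbrace{3}_{\text{constraints of Eq. (7.3)}}
           - \underbrace{1}_{\text{unmeasured variable}} = 2 .
\end{align*}
```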

Since we do not know the correspondence between jets and partons, we try all twelve distinct assignments of the four jets to the partons $(b\bar{b}q\bar{q})$. (But if the event has a $b$-tag, only the six permutations in which the tagged jet is used as a $b$ quark are considered.) Once a permutation is chosen, we apply the parton-level and $\eta$-dependent jet corrections described in Sec. IV. We apply a loose cut on the hadronic W boson mass before the fit: $40 < m(q\bar{q}) < 140$ GeV/$c^2$. Permutations failing this cut are rejected without being fit in order to speed up the computation.
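The counts of twelve and six assignments can be checked with a short enumeration. In this sketch jets are labeled by index, an assignment is a tuple (leptonic-side $b$ jet, hadronic-side $b$ jet, pair of light-quark jets), and the two light-quark jets are treated as interchangeable, which is why there are $4!/2 = 12$ rather than 24 assignments. The function name is hypothetical.

```python
import itertools

def jet_assignments(n_jets=4, tagged=None):
    """List distinct (b_lep, b_had, (q1, q2)) assignments of the four jets.
    If `tagged` is a jet index, keep only assignments using it as a b quark."""
    assignments = []
    for b_lep, b_had in itertools.permutations(range(n_jets), 2):
        if tagged is not None and tagged not in (b_lep, b_had):
            continue
        light = tuple(j for j in range(n_jets) if j not in (b_lep, b_had))
        assignments.append((b_lep, b_had, light))
    return assignments
```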
We arrange the measured variables into a vector $\mathbf{x}^m$ and form the $\chi^2$

$$\chi^2 = (\mathbf{x} - \mathbf{x}^m)^T\, G\, (\mathbf{x} - \mathbf{x}^m), \tag{7.4}$$

where $G$ is the inverse error matrix. This $\chi^2$ is then minimized subject to the kinematic constraints of Eq. (7.3). The minimization algorithm uses the method of Lagrange multipliers; the nonlinear constraint equations are solved using an iterative technique. (The algorithm used is very similar to that of the SQUAW kinematic fitting program j2f|; a detailed description may be found in Ref. Ml.) If this minimization does not converge, the permutation is rejected. A permutation is also rejected if $\chi^2 > 10$. For each surviving permutation, this method gives a fitted mass $m_{\rm fit}$ and a $\chi^2$. We pick the $m_{\rm fit}$ value corresponding to the smallest $\chi^2$ as $m_{\rm fit}$ for the event.
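To illustrate the Lagrange multiplier idea, the sketch below solves the simplest case: a diagonal error matrix and constraints that are linear in the measured variables, $B\mathbf{x} = \mathbf{c}$. It is not the fitter used in the analysis; the real constraints of Eq. (7.3) are nonlinear and involve an unmeasured variable, so a step like this would be applied iteratively to linearized constraints. All names are hypothetical.

```python
import numpy as np

def constrained_fit_step(x_meas, variances, B, c):
    """Minimize (x - x_meas)^T G (x - x_meas) subject to B x = c,
    where G is the inverse of the diagonal error matrix V."""
    x_meas = np.asarray(x_meas, dtype=float)
    V = np.diag(np.asarray(variances, dtype=float))   # error matrix G^-1
    B = np.atleast_2d(np.asarray(B, dtype=float))
    residual = B @ x_meas - np.asarray(c, dtype=float)  # constraint violation
    S = B @ V @ B.T
    multipliers = np.linalg.solve(S, residual)        # Lagrange multipliers
    x_fit = x_meas - V @ B.T @ multipliers
    chi2 = float(residual @ multipliers)
    return x_fit, chi2
```

For example, two measurements of 10 and 12 with unit variance, constrained to be equal, are both pulled to 11 with $\chi^2 = 2$.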

There is one additional wrinkle to the above procedure. In order to start each fit, we must specify an initial value for the unmeasured variable $p_z^\nu$. We choose it so that the two top quarks are assigned equal mass. This yields a quadratic equation for $p_z^\nu$. If the solutions are complex, the real part is used. Otherwise, there are two real solutions. Both are tried, and the fit which gives the smaller $\chi^2$ is retained. Note, however, that since $p_z^\nu$ does not enter into the $\chi^2$ (its measurement error is effectively infinite), the only effect its initial value can have on the final result is to influence which local minimum the fit will find, should there happen to be more than one. In the majority of cases, two distinct neutrino solutions yield nearly the same fit result.

The error matrix $G^{-1}$ is taken to be diagonal. The resolutions used are given in Table VI. (The lepton angular resolutions are much smaller than the other resolutions, and can be taken to be effectively zero.) In most cases, these resolutions were derived from $t\bar{t}$ Monte Carlo events by comparing reconstructed objects to generator-level objects.

Results of this procedure on Monte Carlo $t\bar{t}$ samples are shown in Fig. 17. Figure 17(a) shows results using the HERWIG partons directly, before any QCD evolution has taken place. A rather sharp peak is seen; further, about 80% of the time, the permutation with the lowest $\chi^2$ is the one which is actually correct. The residual width seen in the plot is due mainly to the non-zero widths of the W bosons. Figure 17(b) shows results from the same sample, but after QCD evolution and jet fragmentation. The final state particles are clustered together in cones of width $\Delta R = 0.5$ in order to simulate the action of the jet reconstruction algorithm. This distribution is considerably broader. There are fewer events in the hatched plot because it is not always possible to uniquely define the correct permutation. Due to splitting and merging effects, jet finding inefficiencies, and jets falling below the selection threshold, the correct permutation can be uniquely identified in only about 50% of events. In that case, the correct permutation is the lowest $\chi^2$ permutation about 40% of the time. Finally, Fig. 17(c) shows results for a sample which has been through the full detector simulation and reconstruction. The resulting distribution has essentially the same width as that of Fig. 17(b); this indicates that the dominant contribution to the width of this distribution comes from QCD radiation and jet combinatoric effects, and not from the detector resolution.

The (MC) fit $\chi^2$ distributions resulting from the fit to the correct jet permutation are shown in Fig. 18. The distributions agree reasonably well with the expectations for a two degree-of-freedom $\chi^2$, except for a tail at the high end due to non-Gaussian tails in the resolutions. The (MC) $m_{\rm fit}$ distributions for the four channels are shown in Fig. 19.

Figure 20 shows the distributions which result after the jets in each Monte Carlo event are scaled up or down by the per-jet systematic error of 2.5% + 0.5 GeV. This shifts the fitted mass by approximately $\pm 3.7$ GeV/$c^2$.

Figure 21 shows the fitted mass distribution for several top quark masses and for the background.

A possible objection to the fit method described here is 
that it does not take into account the intrinsic widths of 
the W boson and top quark decays. To investigate this, 
an alternate fitting method was tried which explicitly in- 
TABLE VI. Object resolutions. The operator ⊕ denotes a sum in quadrature.

Object                       Energy resolution                                              σ(φ)       σ(η)
Electrons                    σ(E_T)/E_T = 0.0157 ⊕ 0.072 GeV^(1/2)/√E_T ⊕ 0.66 GeV/E_T
Muons                        σ(1/p) = C (a) ⊕ 0.2/p
Jets, |η_det| < 0.8          σ(E)/E = 0.036 ⊕ 1.145 GeV^(1/2)/√E                            0.04 rad   0.04
Jets, 0.8 < |η_det| < 1.4    σ(E)/E = 0.082 ⊕ 1.264 GeV^(1/2)/√E                            0.05 rad   0.05
Jets, 1.4 < |η_det| < 2.0    σ(E)/E = 0.046 ⊕ 1.305 GeV^(1/2)/√E                            0.05 rad   0.05
k_T                          σ(k_Tx) = σ(k_Ty) = 12 GeV

(a) C = 0.0045/(GeV/c) if the muon track could be matched with a track in the central detector; C = 0.01/(GeV/c) otherwise.
FIG. 17. Tests of kinematic fit method on $t\bar{t}$ Monte Carlo samples ($m_t = 170$ GeV/$c^2$, e + jets channel). (a) Using HERWIG partons directly. (b) Final state Monte Carlo particles, after clustering into $R = 0.5$ cones. (c) After full detector simulation and reconstruction. The hatched plots show the results for the correct jet permutation (regardless of whether or not it has the lowest $\chi^2$). Displayed means and widths are from a Gaussian fit, shown by the dashed curve.

FIG. 18. Fit $\chi^2$ distributions for the correct jet permutation for $t\bar{t}$ Monte Carlo samples ($m_t = 170$ GeV/$c^2$). The dashed curve is the $\chi^2$ distribution for two degrees of freedom, normalized to the area of the histogram.

FIG. 19. Fitted mass distributions for $t\bar{t}$ Monte Carlo samples ($m_t = 170$ GeV/$c^2$) for the jet permutation with the lowest $\chi^2$. Hatched histograms show the results for the correct jet permutation (regardless of whether or not it has the lowest $\chi^2$). Displayed means and widths are from a Gaussian fit, shown by the dashed curve.



[Plot axis: fitted mass (GeV/$c^2$)]

FIG. 20. Fitted mass distributions for $t\bar{t}$ Monte Carlo samples ($m_t = 170$ GeV/$c^2$, e + jets channel). With jets scaled (a) down and (b) up by 2.5% + 0.5 GeV. Hatched histograms show the results for the correct jet permutation (regardless of whether or not it has the lowest $\chi^2$). Displayed means are from a Gaussian fit, shown by the dashed curve.

FIG. 21. Fitted mass distributions, all channels combined. Shown is $t\bar{t}$ Monte Carlo with (a) $m_t = 150$ GeV/$c^2$, (b) $m_t = 170$ GeV/$c^2$, and (c) $m_t = 190$ GeV/$c^2$ and (d) background. The hatched distributions are after the LB selection is applied.



corporates these widths. This method is based on a standard unconstrained
minimization package (MINUIT p7|). The quantity minimized is the $\chi^2$ as
defined in Eq. (7.4) with three Breit-Wigner constraint terms added: two for the
two $W$ bosons, and one for the top quark mass difference:

$$
\chi'^2 = \chi^2
 - 2\ln\frac{\Gamma_W^2/4}{\Gamma_W^2/4 + \left(m(l\nu) - m_W\right)^2}
 - 2\ln\frac{\Gamma_W^2/4}{\Gamma_W^2/4 + \left(m(q\bar{q}) - m_W\right)^2}
 - 2\ln\frac{\Gamma_t^2}{\Gamma_t^2 + \left(m(l\nu b) - m(q\bar{q}b)\right)^2}.
\tag{7.5}
$$

(The factor of 4 difference in the last term comes from convoluting two
Breit-Wigner functions centered on $m(l\nu b)$ and $m(q\bar{q}b)$.) The $W$ boson
width is taken to be 2 GeV/$c^2$. The top quark width is taken to depend on the
mass as $\Gamma_t = (a m_t)^3$; the proportionality constant $a$ is set so that
$\Gamma_t = 0.6$ GeV/$c^2$ at $m_t = 140$ GeV/$c^2$. (Here,
$m_t = (m(l\nu b) + m(q\bar{q}b))/2$.) These widths are small compared to the
experimental resolutions. The results of this procedure are compared to those from
the Lagrange-multiplier based fitter in Fig. 22. In most cases, the results are
nearly identical, implying that neglecting the widths is not a serious problem.
Since this algorithm takes several times longer to execute, it is not used further.
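
The width-dependent penalty described above can be illustrated with a short sketch. It is not the MINUIT-based fitter itself: it only evaluates the three Breit-Wigner terms for given invariant masses. The function names are hypothetical, and the $W$ boson mass value `M_W` is an assumption, since it is not quoted in this section.

```python
import math

GAMMA_W = 2.0   # W boson width in GeV/c^2, as quoted in the text
M_W = 80.4      # ASSUMED W boson mass in GeV/c^2 (not given in this section)


def top_width(m_t):
    """Gamma_t = (a*m_t)^3, with a fixed so that Gamma_t = 0.6 at m_t = 140."""
    a = 0.6 ** (1.0 / 3.0) / 140.0
    return (a * m_t) ** 3


def bw_penalty(delta, width_term):
    """-2 ln of a Breit-Wigner shape normalised to 1 at its peak.

    delta is the mass difference; width_term is Gamma^2/4 for the W terms
    and Gamma_t^2 for the top quark mass-difference term.
    """
    return -2.0 * math.log(width_term / (width_term + delta ** 2))


def chi2_with_widths(chi2, m_lnu, m_qq, m_lnub, m_qqb):
    """Add the three Breit-Wigner terms to an existing chi^2 value."""
    m_t = 0.5 * (m_lnub + m_qqb)
    gamma_t = top_width(m_t)
    return (chi2
            + bw_penalty(m_lnu - M_W, GAMMA_W ** 2 / 4.0)
            + bw_penalty(m_qq - M_W, GAMMA_W ** 2 / 4.0)
            + bw_penalty(m_lnub - m_qqb, gamma_t ** 2))
```

When all three masses sit exactly on their constraints, the penalty vanishes and the original $\chi^2$ is returned unchanged.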

FIG. 22. Differences between the results obtained from the MINUIT-based fitter and
the Lagrange-multiplier based fitter for (a) $m_{\rm fit}$ and (b) $\chi^2$. (For
$t\bar{t}$ Monte Carlo with $m_t = 170$ GeV/$c^2$, $e$ + jets channel.)



C. Likelihood fit 

The next problem to be solved is the extraction of the 
top quark mass from the data sample, which is a mixture 
of signal and background. This is done using a binned 
Poisson-statistics maximum-likelihood fit at discrete top 
quark masses. (The method is described in more detail 
in Ref. @.) 

We bin the data according to some characteristics of the events. (For this
analysis, we will be using $m_{\rm fit}$ and either $\mathcal{D}_{\rm LB}$ or
$\mathcal{D}_{\rm NN}$.) Call the number of bins $M$, the total number of events
$N$, and the number of events in each bin $N_j$.

We also know the distribution expected for different values of the top quark mass,
and also for the background. (This is from Monte Carlo except for the QCD multijet
background.) For both the signal and background, we have a distribution of events
among the $M$ bins; call the numbers of events in each bin of these distributions
$A^s_j$ and $A^b_j$.

We regard these distributions as drawn from "true" distributions $a^s_j$ and
$a^b_j$, and write the probability for seeing the observed data set $D$ given these
parameters as a Poisson likelihood

$$
L(D|A,a,p) = \prod_{j=1}^{M} q(N_j,\, p_s a^s_j + p_b a^b_j)\,
             q(A^s_j, a^s_j)\, q(A^b_j, a^b_j),
\tag{7.6}
$$

where $q$ is the Poisson distribution $q(N,a) = e^{-a} a^N / N!$ and $p_s$ and $p_b$
are the signal and background strengths. These strengths can be related to the
number of expected events $n_s$ and $n_b$ by $p_s = n_s / (M + \sum_j A^s_j)$, and
similarly for $n_b$. (The $M$ term in the denominator ensures that the sum of the
maximum likelihood estimates for $n_s$ and $n_b$ equals $N$. See Ref. p3] for
further discussion. Note that usually $M \ll \sum_j A_j$.) The total number of
events expected in bin $j$ is thus $n_j = p_s a^s_j + p_b a^b_j$. We eliminate the
$a_j$'s from this likelihood by integrating over them; the result is



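$$
L(D|A,p) = \prod_{j=1}^{M} \sum_{k=0}^{N_j}
 \binom{A^s_j + k}{k} \frac{p_s^{\,k}}{(1+p_s)^{A^s_j + k + 1}}
 \binom{A^b_j + N_j - k}{N_j - k}
 \frac{p_b^{\,N_j - k}}{(1+p_b)^{A^b_j + N_j - k + 1}}.
\tag{7.7}
$$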



Following Ref. ||, we then modify the likelihood by dividing by the constant factor

$$
\prod_{j=1}^{M} q(N_j, N_j).
\tag{7.8}
$$

This has the effect of making the quantity $-2 \ln L$ behave asymptotically like a
$\chi^2$ distribution. (Note, however, that for our experiment, the sample size is
too small for this asymptotic behavior to be accurately realized.)
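
As an illustration of the likelihood just described, the following sketch evaluates the integrated form bin by bin in log space. It assumes the binomial-coefficient form obtained by integrating each Poisson factor of Eq. (7.6) over $a^s_j$ and $a^b_j$ with a flat prior, and it requires $p_s, p_b > 0$. All function names are hypothetical; this is not the analysis code.

```python
import math


def log_binom(n, k):
    """ln of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_poisson(n, mu):
    """ln q(n, mu) for the Poisson distribution, with q(0, 0) = 1."""
    return (n * math.log(mu) if n > 0 else 0.0) - mu - math.lgamma(n + 1)


def log_bin_term(n_obs, a_s, a_b, p_s, p_b):
    """ln of the sum over k for one bin (k = events assigned to signal)."""
    terms = []
    for k in range(n_obs + 1):
        sig = (log_binom(a_s + k, k) + k * math.log(p_s)
               - (a_s + k + 1) * math.log1p(p_s))
        bkg = (log_binom(a_b + n_obs - k, n_obs - k)
               + (n_obs - k) * math.log(p_b)
               - (a_b + n_obs - k + 1) * math.log1p(p_b))
        terms.append(sig + bkg)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def neg_log_likelihood(data, sig_mc, bkg_mc, p_s, p_b):
    """-ln L summed over bins, after dividing by prod_j q(N_j, N_j)."""
    total = 0.0
    for n_obs, a_s, a_b in zip(data, sig_mc, bkg_mc):
        total -= (log_bin_term(n_obs, a_s, a_b, p_s, p_b)
                  - log_poisson(n_obs, n_obs))
    return total
```

A useful sanity check is that, for fixed Monte Carlo counts and strengths, the per-bin term sums to one over all possible observed counts.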

We now have a set of signal models, each corresponding to a different top quark
mass $m_t$. For each signal model, we fit it plus the background to the data,
yielding $n_s$ and $n_b$. A maximum likelihood fit is used, based on MINUIT HtJ .
The minimum value of $-\ln L$ is retained; call this $-\ln L_{\min}$. The resulting
values of $(m_t, -\ln L_{\min})$ then define a likelihood curve as a function of top
quark mass.

We also define a statistical error on $-\ln L_{\min}$ due to the finite Monte Carlo
statistics. This is done by the simple method of taking in turn each bin $j$ in the
input Monte Carlo histograms, varying the contents up or down by $\sqrt{A_j}$, and
re-evaluating the likelihood. (To save time, the fit for $n_s$ and $n_b$ is not
redone for each variation; early testing showed it to make very little difference.)
The resulting variations in $-\ln L_{\min}$ for each bin are then added in
quadrature. This error is calculated separately for the signal and background
samples; however, any effects from fluctuations in the background sample will be
highly correlated from mass point to mass point. Thus, the errors shown on the
plots and used in the fit below come from the signal samples only.
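
The bin-by-bin variation procedure above can be sketched as follows. The text does not say how the "up" and "down" shifts are combined for a given bin; this sketch assumes the average of their absolute values, which is an assumption rather than the documented choice. `nll` stands for any callable that returns $-\ln L$ for a given template histogram at fixed $n_s$ and $n_b$.

```python
import math


def mc_stat_error(nll, template):
    """Error on -ln L from finite Monte Carlo statistics in one template.

    Each bin is varied by +/- sqrt(A_j) in turn with everything else held
    fixed; the per-bin shifts are then added in quadrature.
    ASSUMPTION: up and down shifts are averaged in absolute value.
    """
    base = nll(template)
    total_sq = 0.0
    for j, a_j in enumerate(template):
        step = math.sqrt(a_j)
        shifts = []
        for sign in (+1.0, -1.0):
            varied = list(template)
            varied[j] = max(a_j + sign * step, 0.0)  # keep contents non-negative
            shifts.append(abs(nll(varied) - base))
        total_sq += (0.5 * sum(shifts)) ** 2
    return math.sqrt(total_sq)
```

For a toy function that is linear in the bin contents, the result reduces to the quadrature sum of the $\sqrt{A_j}$ steps, which gives a quick check of the bookkeeping.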

The final step is to extract a mass value from this set of $(m_t, -\ln L_{\min})$
points. This is done by fitting a quadratic function to the smallest
$-\ln L_{\min}$ and the four closest points on each side. The points are weighted by
the statistical errors assigned to the $-\ln L_{\min}$ values. The position of the
minimum of this quadratic defines the mass estimate, and its width (where the curve
has risen by 0.5) gives an error estimate. We also want estimates for $n_s$ and
$n_b$. For each mass $m_t$, we have a separate estimate for $n_s$ and $n_b$ returned
from MINUIT. The final estimates of these values are determined by a linear
interpolation between the two points bracketing the final $m_t$ estimate. The errors
are found in the same manner.
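
The nine-point quadratic extraction described above can be sketched as a weighted least-squares parabola fit. The helper names are hypothetical, and the grid is assumed to be sorted in mass so that "closest points" means neighbouring grid indices. The masses are centred on the lowest point before fitting to keep the normal equations well conditioned.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_parabola(us, ys, sigmas):
    """Weighted least squares for y = c0 + c1*u + c2*u^2 via Cramer's rule."""
    w = [1.0 / s ** 2 for s in sigmas]
    s_pow = [sum(wi * u ** k for wi, u in zip(w, us)) for k in range(5)]
    t_pow = [sum(wi * y * u ** k for wi, u, y in zip(w, us, ys)) for k in range(3)]
    mat = [[s_pow[i + j] for j in range(3)] for i in range(3)]
    d = det3(mat)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in mat]
        for r in range(3):
            mc[r][col] = t_pow[r]
        coeffs.append(det3(mc) / d)
    return coeffs


def extract_mass(masses, nll, errors, n_side=4):
    """Mass estimate and error from a parabola through 2*n_side + 1 points."""
    i_min = min(range(len(nll)), key=nll.__getitem__)
    lo, hi = max(i_min - n_side, 0), min(i_min + n_side + 1, len(nll))
    x0 = masses[i_min]
    us = [m - x0 for m in masses[lo:hi]]
    c0, c1, c2 = fit_parabola(us, nll[lo:hi], errors[lo:hi])
    m_hat = x0 - c1 / (2.0 * c2)        # vertex of the parabola
    return m_hat, math.sqrt(0.5 / c2)   # half-width where the curve rises by 0.5
```

With `n_side=4` this uses nine points; `n_side=5` would give the eleven-point variant mentioned for comparison.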

For comparison, some results are also given using 
11 points instead of 9 for the polynomial fit, and using a 
cubic function instead of a quadratic one. 

TABLE VII. NN bin definitions.

Bin    $\mathcal{D}_{\rm NN}$ range
 1     0.000 - 0.105
 2     0.105 - 0.166
 3     0.166 - 0.257
 4     0.257 - 0.373
 5     0.373 - 0.488
 6     0.488 - 0.595
 7     0.595 - 0.687
 8     0.687 - 0.766
 9     0.766 - 0.846
10     0.846 - 1.000



D. Fitting variables and binning 

From each event, we derive two variables: the fitted mass $m_{\rm fit}$ and a
discriminant $\mathcal{D}$. We use these variables to bin the data into a
two-dimensional histogram. The top quark mass is then extracted from a fit to the
expectations from Monte Carlo, as described in the previous section.

Two different discriminants and histogram binnings are used. For both binnings, the
fitted mass axis has twenty bins of width 10 GeV/$c^2$ over the range 80 to
280 GeV/$c^2$. They differ in the definition of the discriminant axis. For the "LB"
analysis, the discriminant axis is divided into two bins, the first bin containing
events which fail the LB selection (as defined in Sec. VI B), and the second
containing events which pass it. (Recall that all tagged events pass the LB
selection.) For the "NN" analysis, the discriminant axis is the NN variable
$\mathcal{D}_{\rm NN}$. (Note that tagging information is not used in forming
$\mathcal{D}_{\rm NN}$.) There are ten unevenly spaced bins, as defined in
Table VII. These bin boundaries were chosen so that the expected signal +
background distribution populates the bins approximately uniformly. There are thus
40 bins in the LB binning, and 200 bins in the NN binning. Examples of the resulting
histograms are shown in Fig. 23.
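
The two binnings described above can be written down compactly. The sketch below maps an event to a flat bin index for either analysis; the NN edges are those of Table VII. The flat-index layout (discriminant bin times twenty plus mass bin) and the handling of events outside the 80-280 GeV/$c^2$ range are illustrative assumptions, not something the text specifies.

```python
import bisect

N_MASS_BINS = 20
MASS_LO, MASS_HI, MASS_WIDTH = 80.0, 280.0, 10.0   # GeV/c^2
NN_EDGES = [0.000, 0.105, 0.166, 0.257, 0.373, 0.488,
            0.595, 0.687, 0.766, 0.846, 1.000]     # Table VII


def mass_bin(m_fit):
    """Index (0-19) of the fitted-mass bin, or None if outside the range."""
    if not MASS_LO <= m_fit < MASS_HI:
        return None  # ASSUMPTION: out-of-range events are simply not binned
    return int((m_fit - MASS_LO) // MASS_WIDTH)


def nn_bin(d_nn):
    """Index (0-9) of the unevenly spaced NN discriminant bin."""
    i = bisect.bisect_right(NN_EDGES, d_nn) - 1
    return min(max(i, 0), len(NN_EDGES) - 2)


def lb_index(m_fit, passes_lb):
    """Flat index in the 2 x 20 = 40 bin LB histogram."""
    mb = mass_bin(m_fit)
    return None if mb is None else (1 if passes_lb else 0) * N_MASS_BINS + mb


def nn_index(m_fit, d_nn):
    """Flat index in the 10 x 20 = 200 bin NN histogram."""
    mb = mass_bin(m_fit)
    return None if mb is None else nn_bin(d_nn) * N_MASS_BINS + mb
```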

These histograms are generated separately for each of 
the four channels. They are then combined using the set 
of fixed weights given in Table VIII. We derive these 
numbers by calculating the expected signal and back- 
ground in each channel using the same techniques as used 
for the cross section measurement || (except that only 
the precuts are applied). We also combine the histograms 
for VECBOS W + jets background and the QCD multijet 
background using a fixed QCD fraction of (22 ± 5)%, de- 
rived in the same manner. 



FIG. 23. Monte Carlo histograms for LB and NN analyses for $t\bar{t}$ Monte Carlo
with $m_t = 175$ GeV/$c^2$, VECBOS $W$ + jets background, and QCD multijet
background. More top quark-like events are towards the top of the plots. Panel
titles: (a) LB, $m_t = 175$ GeV/$c^2$; (b) NN, $m_t = 175$ GeV/$c^2$; (c) LB,
$W$ + jets; (d) NN, $W$ + jets; (f) NN, QCD multijet.



E. Fits to data 



The results of the kinematic fit for the candidate events
are given in Tables IX through XII. (Complete details of
the candidate events are available in Ref. E9].) There are

TABLE VIII. Fraction of events expected in each channel after the precuts.

Sample                              e + jets         e + jets/$\mu$    $\mu$ + jets      $\mu$ + jets/$\mu$
HERWIG $t\bar{t}$, 110-150 GeV/$c^2$   0.376 ± 0.020    0.085 ± 0.013    0.468 ± 0.025    0.071 ± 0.018
HERWIG $t\bar{t}$, 155-170 GeV/$c^2$   0.418 ± 0.018    0.097 ± 0.011    0.425 ± 0.021    0.059 ± 0.015
HERWIG $t\bar{t}$, 172-190 GeV/$c^2$   0.427 ± 0.016    0.093 ± 0.010    0.409 ± 0.019    0.071 ± 0.013
HERWIG $t\bar{t}$, 195-230 GeV/$c^2$   0.416 ± 0.014    0.097 ± 0.009    0.419 ± 0.018    0.068 ± 0.012
VECBOS                              0.531 ± 0.077    0.015 ± 0.017    0.441 ± 0.079    0.013 ± 0.003
QCD                                 0.443 ± 0.111    0.013 ± 0.030    0.488 ± 0.115    0.056 ± 0.020

FIG. 24. Fitted mass distributions for candidate events.
The hatched histograms show the LB subsample.

91 events passing the precuts (PR). One of these, however, had no successful fits,
and is not considered further. Thirty-six of these events then pass the LB
selection. The distributions of the fitted masses of these candidates are shown in
Fig. 24. When the $\chi^2 < 10$ cut is imposed, there are 77 PR events and 31 LB
events. Distributions of their fitted masses are shown in Fig. 25. The $\chi^2$
distribution of the 90 events is shown in Fig. 26. It compares well to the
expectation from Monte Carlo.

Results of likelihood fits to the data sample are shown in Table XIII. Several
methods of extracting the final top quark mass are tabulated. The labels "quad$N$"
and "cub$N$" denote, respectively, $N$-point quadratic and cubic fits to the
negative log likelihood values. The reported central value is the minimum of the
fit curve, and the error indicated is the width of the curve where it has risen by
0.5 from the minimum. For the "avg" fits, the central value is the mean of the
likelihood curve (calculated using trapezoidal-rule integration), and the reported
error on the mass is the symmetric interval around the mean containing 68% of the
likelihood. Table XIII also shows the

FIG. 25. Fitted mass distributions for candidate events with $\chi^2 < 10$. The
hatched histograms show the LB subsample.

TABLE IX. Kinematic fit results and top quark discriminants for events in the
$e$ + jets channel for the jet permutation with the smallest $\chi^2$. The "Perm."
column gives the assignment of the jets to partons, listed in order of decreasing
jet $E_T$. $B_l$ and $B_h$ denote the $b$ quarks associated with the leptonically
and hadronically decaying top quarks, respectively, while $W$ denotes the quarks
from the hadronically decaying $W$ boson. The fitted mass $m_{\rm fit}$ is in
GeV/$c^2$.

Run     Event   Perm.           m_fit    chi^2   D_LB   D_NN
62199   15224   B_l W W B_h     265.4    15.9    0.09   0.21
62431     788   W B_h B_l W     241.7    0.23    0.16   0.09
63066   13373   B_l W B_h W     206.8    1.35    0.85   0.95
64464   21611   B_h W W B_l     115.7    0.64    0.22   0.31
81949   12380   B_l B_h W W     132.7    1.10    0.77   0.82
82024   44002   W B_l B_h W     130.2    0.97    0.06   0.31
82220   20012   B_h W B_l W     120.8    2.53    0.03   0.06
82996   24461   W B_h W B_l     166.8    31.8    0.73   0.74
84331   13271   B_h W B_l W     116.8    14.4    0.25   0.27
84890   28925   B_h W B_l W     126.4    0.78    0.06   0.07
85917      22   B_l W W B_h     162.3    2.26    0.79   0.81
86518   11716   B_h W W B_l     243.5    0.54    0.18   0.29
86601   33128   W B_l B_h W     179.2    0.39    0.43   0.29
87063   39091   B_h W B_l W     188.4    0.39    0.58   0.63
87104   25823   W B_h B_l W     119.9    2.11    0.06   0.09
87329   13717   B_h B_l W W     242.1    1.95    0.39   0.23
87446   14294   W W B_h B_l     118.3    1.11    0.59   0.52
88038   14829   W W B_h B_l     101.0    12.8    0.37   0.28
88044    9807   W B_l W B_h     145.2    34.0    0.09   0.11
88045   35311   W B_h W B_l     178.2    2.71    0.83   0.81
88125   15437   W B_h W B_l     115.9    0.16    0.78   0.74
88463    3627   W B_h W B_l     111.7    9.93    0.16   0.46
88588   15993   W W B_h B_l     103.4    7.44    0.29   0.30
89484   11741   B_h B_l W W     135.0    0.76    0.53   0.58
89550   18042   W W B_h B_l     103.5    0.07    0.30   0.27
89708   24871   W B_h B_l W     144.6    20.1    0.62   0.74
89936    6306   W B_h B_l W     220.4    1.29    0.50   0.68
89972   13657   W B_h B_l W     176.7    9.08    0.65   0.77
90108   31611   W B_h W B_l     137.4    0.41    0.21   0.21
90435   32258   B_h B_l W W     154.1    1.05    0.27   0.62
90496   28296   B_h W B_l W     112.9    0.28    0.23   0.19
90693    8678   B_h B_l W W     105.5    8.98    0.51   0.27
90795   14246   B_h W B_l W     193.9    12.8    0.09   0.07
90804    6474   W W B_l B_h     114.2    0.64    0.34   0.59
91923     502   W B_h W B_l     162.1    0.14    0.09   0.15
92013   11825   W B_h B_l W     134.1    3.68    0.11   0.15
92217     109   W W B_h B_l     107.8    0.58    0.77   0.82
92278   21744   W B_h W B_l     125.9    7.26    0.17   0.31
92673    4679   B_l B_h W W     267.7    1.85    0.92   0.97
94750    4683   B_h W W B_l     201.5    3.63    0.32   0.49
96329   13811   -               -        -       0.54   0.79
96676   79957   W B_l B_h W     224.1    0.47    0.36   0.46
96738   27592   B_l W W B_h     236.6    5.68    0.60   0.83

TABLE X. Same as Table IX for the $\mu$ + jets channel.

Run     Event   Perm.           m_fit    chi^2   D_LB   D_NN
61514    4537   B_h W B_l W     120.8    3.40    0.26   0.59
63183   13926   W W B_h B_l     133.7    1.26    0.84   0.83
63740   14197   B_l W B_h W     185.3    2.56    0.94   0.96
80703   31477   W B_h B_l W     167.2    0.54    0.24   0.40
81909   11966   B_h W B_l W     162.9    1.11    0.67   0.66
81949   13778   W B_h W B_l     109.2    8.25    0.27   0.25
82639   11573   W B_l W B_h     117.3    2.24    0.35   0.47
82694   25595   W B_l W B_h     114.0    2.03    0.56   0.53
84696   29253   W B_h B_l W     221.0    1.05    0.74   0.89
84728   18171   B_h B_l W W     136.0    3.65    0.40   0.38
85888   28599   B_h W W B_l     189.6    5.78    0.18   0.09
87063   14368   W W B_h B_l     182.1    0.02    0.50   0.72
87604   14282   B_l W W B_h      90.6    40.6    0.14   0.38
87820    6196   B_h B_l W W     178.0    17.8    0.87   0.97
88464    2832   B_h W B_l W     154.1    0.14    0.87   0.93
88530    7800   W B_l B_h W     151.2    0.08    0.62   0.60
88597    1145   W W B_h B_l     124.6    10.2    0.20   0.42
88603    2131   W B_l W B_h     123.7    0.66    0.13   0.17
89751   27345   B_h W W B_l     132.4    1.14    0.15   0.14
89943   19016   W B_h B_l W     163.7    0.03    0.65   0.74
90133   14110   W B_h W B_l     169.4    4.88    0.26   0.28
90660   20166   W B_l B_h W     222.6    1.28    0.70   0.90
90690   12392   B_h W B_l W     153.3    0.58    0.70   0.78
90836   14924   W B_l W B_h     147.4    3.13    0.07   0.08
90864   17697   W B_h W B_l      96.6    0.81    0.44   0.62
91359   15030   W B_h W B_l     118.9    1.81    0.54   0.60
92081    3825   W B_h B_l W     117.7    3.72    0.07   0.40
92082   34466   W B_h B_l W     176.2    0.30    0.31   0.49
92114    1243   W B_l B_h W     187.0    11.7    0.96   0.96
92126   21544   B_l W W B_h     157.2    0.02    0.82   0.91
92142   27042   W B_l B_h W     148.7    4.71    0.24   0.21
92226   34133   W B_h B_l W     140.3    0.49    0.41   0.66
92714    4141   W W B_h B_l     106.4    6.28    0.43   0.59
92714   12581   B_l B_h W W     166.3    1.66    0.57   0.66
94750    1147   W W B_l B_h     126.9    0.82    0.32   0.23
96258    2707   B_l W B_h W     171.2    1.02    0.49   0.28
96264   93611   B_h W W B_l     111.7    0.41    0.06   0.14
96280   14555   W B_h B_l W     133.8    0.07    0.69   0.68
96287   20104   W B_h B_l W     182.5    5.64    0.16   0.14
96399   32921   B_l B_h W W     172.8    0.28    0.68   0.83
96591   39318   B_h B_l W W     174.3    0.94    0.55   0.75

TABLE XI. Same as Table IX for the $e$ + jets/$\mu$ channel.

Notes   Run     Event   Perm.           m_fit    chi^2   D_LB   D_NN
a       62199   13305   B_l B_h W W     173.2    40.0    0.55   0.61
b c     85129   19079   W B_l B_h W     137.0    0.93    0.81   0.85
b c     86570    8642   B_h W W B_l     144.5    0.66    0.74   0.29
a       89372   12467   B_h W W B_l     186.6    22.1    0.23   0.25

TABLE XII. Same as Table IX for the $\mu$ + jets/$\mu$ channel.

Notes   Run     Event   Perm.           m_fit    chi^2   D_LB   D_NN
a b c   58203    4980   W B_h B_l W     138.3    0.25    0.56   0.62
a b c   91712      22   B_h W B_l W     203.3    0.44    0.51   0.44
a b c   92704   14022   W B_h B_l W     175.8    0.11    0.79   0.88

Notes for Tables IX-XII:
a Passes LB selection.
b Used in variable-mass analysis.
c Used in pseudolikelihood analysis.
[The per-event note markers for Tables IX and X are not legible in this copy and
are omitted.]




result for the "NN2" binning. This is a variant of the NN binning which uses only
two bins in $\mathcal{D}_{\rm NN}$: both the first six bins and the last four bins
are coalesced. The result is seen to be consistent with the 10-bin NN analysis.
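
The "avg" estimate described earlier (the mean of the likelihood curve by trapezoidal-rule integration, with a symmetric interval containing 68% of the likelihood) can be sketched as below. The likelihood is treated as piecewise linear between the discrete mass points, and the interval half-width is found by bisection; both are implementation assumptions, and the function names are hypothetical.

```python
import math


def trapz(xs, ys):
    """Trapezoidal-rule integral of samples ys taken at points xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def interp(xs, ys, x):
    """Linear interpolation of (xs, ys) at x; zero outside the grid."""
    if x < xs[0] or x > xs[-1]:
        return 0.0
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    return ys[-1]


def integral(xs, ys, a, b):
    """Integral of the piecewise-linear curve between a and b."""
    a, b = max(a, xs[0]), min(b, xs[-1])
    if b <= a:
        return 0.0
    pts = [a] + [x for x in xs if a < x < b] + [b]
    return trapz(pts, [interp(xs, ys, p) for p in pts])


def avg_estimate(masses, nll, frac=0.68):
    """Likelihood-weighted mean mass and symmetric half-width holding `frac`."""
    nll_min = min(nll)
    like = [math.exp(nll_min - v) for v in nll]   # L up to a constant factor
    norm = trapz(masses, like)
    mean = trapz(masses, [m * l for m, l in zip(masses, like)]) / norm
    lo, hi = 0.0, max(mean - masses[0], masses[-1] - mean)
    for _ in range(60):                            # bisection on the half-width
        mid = 0.5 * (lo + hi)
        if integral(masses, like, mean - mid, mean + mid) < frac * norm:
            lo = mid
        else:
            hi = mid
    return mean, 0.5 * (lo + hi)
```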

For our final result, we use the nine-point quadratic fit. This choice is motivated
by a desire to use a simple functional form; furthermore, it will be seen in the
next section that among the polynomial fits considered, it gives the slope closest
to unity when one plots extracted mass versus Monte Carlo input mass. The resulting
mass is then $174.0 \pm 5.6$ GeV/$c^2$ for the LB binning, and $171.3 \pm 6.0$
GeV/$c^2$ for NN. These fit results are exhibited in Figs. 27-30.

Note in Fig. 28 that $-\ln L$ tends to flatten out away from the minimum. Due to
this, we limit the polynomial fit to the central region, where $-\ln L$ is most
nearly quadratic. This flattening is related to the fact that we do not impose an
external constraint on the number of signal or background events in the likelihood
fit. If such a constraint is imposed, as was done in Ref. ||, the $-\ln L$ curve
shows less tendency to flatten.

To use more likelihood points in the fit, a functional 
form which can model this flattening is needed. One such 
function which we investigated is 

FIG. 26. Fit $\chi^2$ distribution from data (histogram), the expected $t\bar{t}$
signal + background (filled circles), and background alone (open triangles).



$$
F(x) = -\ln\bigl(P_1 + P_2\, g(x - P_5, P_8) + P_3\, g(x - P_6, 2P_8)
       + P_4\, g(x - P_7, 4P_8)\bigr),
\tag{7.9}
$$



where $g$ is the Gaussian form $g(x, \sigma) = \exp(-(x/\sigma)^2/2)$.
We determine the parameters $P_1$-$P_8$ by fitting this function
(using MINUIT) to the likelihood points over the entire
range of 110-230 GeV/$c^2$; the results are plotted in
Fig. 28. If we extract from these curves the positions of
the minima, the results are I73.6I5 5 GeV/c 2 for LB and 
172.4^2 GeV/c 2 for NN (taking the error from where 
the curve rises by 0.5). From this, we conclude that the 
procedure of fitting a quadratic in the central region does 
not seriously underestimate the width. In addition, in 
Monte Carlo studies, F(x) did not perform better on av- 
erage than the simple quadratic fit; thus, we do not use 
F(x) for the final mass extraction. 
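
To make the shape of $F(x)$ concrete, the sketch below evaluates a constant plus three Gaussians of widths $P_8$, $2P_8$ and $4P_8$ inside the logarithm, assuming the terms enter with plus signs. The parameter values in the usage note are purely illustrative and are not the fitted values.

```python
import math


def gauss(x, sigma):
    """g(x, sigma) = exp(-(x/sigma)^2 / 2), an unnormalised Gaussian."""
    return math.exp(-0.5 * (x / sigma) ** 2)


def flattening_curve(x, params):
    """F(x) = -ln(P1 + P2 g(x-P5, P8) + P3 g(x-P6, 2 P8) + P4 g(x-P7, 4 P8))."""
    p1, p2, p3, p4, p5, p6, p7, p8 = params
    return -math.log(p1
                     + p2 * gauss(x - p5, p8)
                     + p3 * gauss(x - p6, 2.0 * p8)
                     + p4 * gauss(x - p7, 4.0 * p8))
```

With hypothetical parameters such as `(0.1, 1.0, 0.0, 0.0, 173.0, 173.0, 173.0, 6.0)`, the curve is nearly quadratic close to 173 and saturates at $-\ln P_1$ far from the minimum, which is the flattening behaviour the constant term is there to model.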

We have explored some additional variations in the definition of the likelihood
function. The algorithm of HMCMLL [|o) starts with the same likelihood as
Eq. (7.6), but eliminates the nuisance parameters $a^s_j$ and $a^b_j$ using a
maximum likelihood estimate rather than integration. To be able to compare
likelihoods from different Monte Carlo samples, though, we modify the likelihood
following the prescription of Ref. || :



L 



A]) 



(7.10) 



The results of this procedure are given in Table XIV. Alternatively, we can
eliminate $n_s$ and $n_b$ by integrating over them, rather than by using a maximum
likelihood estimate. The results of this are also given in Table XIV.

TABLE XIII. Results of fits to the candidate sample, showing the top quark mass
$m_t$ and the number of signal and background events $n_s$ and $n_b$. The labels
"quad$N$" and "cub$N$" denote $N$-point quadratic and cubic fits, while "avg"
denotes the mean value of the posterior mass probability distribution.
"$-\ln L_{\min}$" is the minimum $-\ln L$ point; $\chi^2_{\rm poly}$ is for the
polynomial fit to the likelihood points.

Binning   $-\ln L_{\min}$   Method   $m_t$ (GeV/$c^2$)   $n_s$   $n_b$   $\chi^2_{\rm poly}$






LB 


23.1 


quad9 

nil Qn 1 I 

cub9 
cub 11 

avg 


174.0tr 6 
1 7 A q+ 7 - 5 

1(4. J_ 7 5 

173.71^ 
172.41^ 
175.4+^ 


oq Q+8.5 

23.8+7-g 

no 7+8.5 
' -9.2 

FIG. 27. Fitted mass for all events which pass the precuts and the $\chi^2$ cut.
Filled circles are a mixture of $t\bar{t}$ signal and background and open triangles
are the background only, both averaged between the results of the LB and NN
analyses.

FIG. 28. Negative log likelihood for (a) LB and (b) NN analyses. The solid curve is
a quadratic fit to the 9 points around the minimum; the dashed curve is from fitting
Eq. (7.9) to all points in the range 110-230 GeV/$c^2$. (c) Results of the LB fit
for events passing the LB selection. The histogram is data, filled circles are a
mixture of $m_t = 175$ GeV/$c^2$ $t\bar{t}$ signal and background, normalized using
the results of the LB fit, and open triangles are background only.

FIG. 29. Results of the LB fit for events failing the LB selection. The histogram
is data, filled circles are a mixture of $m_t = 175$ GeV/$c^2$ $t\bar{t}$ signal
and background, normalized using the results of the LB fit, and open triangles are
background only.



TABLE XIV. Additional fit results. 



Method   Binning   $-\ln L_{\min}$   $m_t$ (GeV/$c^2$)   $n_s$   $n_b$


HMCMLL 


LB 


22.7 


174.1+°;° 


23.6t™ 


53.4+JV 




NN 


73.1 


172.0+^1 


34.ot; 4 9 3 


42.6+^° 


Integration 


LB 

NN 


17.2 
68.5 


174.5±™ 
169.8j£! 


24.9t|;J 

on fj+8-4 
OU.O_ 10 2 


54.2±i 2 9 
48.5^ 



These variations do not have a large effect on the final 
result. 

To further test the stability of these results, we repeat the fits using samples in
which one candidate event is removed, for a total of 77 distinct fits. For the LB
case, the RMS of the resulting distribution of fits was 0.3 GeV/$c^2$; the smallest
result seen was 173.0 GeV/$c^2$, and the largest was 174.7 GeV/$c^2$. For the NN
case, the RMS was 0.5 GeV/$c^2$, the smallest result was 170.1 GeV/$c^2$, and the
largest was 172.5 GeV/$c^2$.

To summarize the main results of this section, the LB analysis yields
$m_t = 174.0 \pm 5.6$ GeV/$c^2$, and the NN analysis yields
$m_t = 171.3 \pm 6.0$ GeV/$c^2$.



F. Tests with Monte Carlo samples 

FIG. 30. Results of NN fit: (a) Data, (b) $m_t = 172$ GeV/$c^2$ $t\bar{t}$ signal
plus background, normalized using the results of the NN fit.



We test the mass extraction procedure by performing 
fits to ensembles of Monte Carlo experiments of known 
composition. The size of the experiments is fixed; the 
number of background events in each is chosen from a 
binomial distribution with a fixed mean. 
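
The ensemble construction just described can be sketched as follows: the experiment size is fixed, the number of background events is drawn from a binomial distribution with the requested mean, and the remainder is signal. Drawing events with replacement from pools of simulated events is an assumption of this sketch, as are the function names.

```python
import random


def draw_composition(n_total, mean_background, rng):
    """Return (n_signal, n_background) for one experiment of fixed size.

    n_background ~ Binomial(n_total, mean_background / n_total).
    """
    p_bkg = mean_background / n_total
    n_b = sum(1 for _ in range(n_total) if rng.random() < p_bkg)
    return n_total - n_b, n_b


def make_experiment(signal_pool, background_pool, n_total, mean_background, rng):
    """Build one Monte Carlo experiment as a list of events.

    ASSUMPTION: events are sampled with replacement from each pool.
    """
    n_s, n_b = draw_composition(n_total, mean_background, rng)
    return ([rng.choice(signal_pool) for _ in range(n_s)]
            + [rng.choice(background_pool) for _ in range(n_b)])
```

For the first set of tests in the text this would be called with `n_total=78` and `mean_background=52.0`, repeated 1000 times per input mass.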

For the first set of tests, the ensembles consist of 1000 experiments with a
composition of $\langle n_s \rangle = 26$ and $\langle n_b \rangle = 52$, for an
experiment size of $N = 78$ events with a 1:2 signal/background ratio. Results for
the LB and NN analyses are shown in Tables XV and XVI. For these tests, the
tabulated mean value is from a Gaussian fit to the extracted mass distribution, and
the width is the symmetric interval around the mean which contains 68% of the
entries. (We estimate the statistical errors on these means and widths to be in the
range 0.5-1.0 GeV/$c^2$.) Note that the 9-point quadratic fit gives the slope
closest to unity. Some results for ensembles containing signal only are given in
Tables XVII and XVIII.



There are several competing factors which contribute to the mass dependence of the
width of the ensemble mass distributions $\sigma(m_t)$ observed in Tables XV and
XVI. As $m_t$ increases, the widths of the $m_{\rm fit}$ distributions slowly
increase. From this one would expect the $\sigma(m_t)$ to increase with increasing
top quark mass. However, we rely on the difference between the signal and
background $m_{\rm fit}$ distributions to set the background normalization. This
difference is smallest for $m_t$ around 140-150 GeV/$c^2$; thus, one would expect
$\sigma(m_t)$ to be larger in that region. Finally, the spacing of the generated
Monte Carlo points is finer in the region near 170 GeV/$c^2$; the available
statistics are also larger there. This permits a more

TABLE XV. Ensemble tests for the LB analysis with 1:2 signal/background, showing
means and 68% widths. "Slope" is from a linear fit to the means. All entries except
the slopes are in GeV/$c^2$.

Input       quad9            quad11           cub9             cub11
mass     mean   width     mean   width     mean   width     mean   width
150      150.4  10.7      150.8  11.1      151.5  10.3      151.9  10.9
155      155.2   9.1      155.3   9.f      155.3   9.0      156.5   8.4
160      160.7   9.2      160.9   9.1      160.9   9.3      161.4   8.3
162      162.6   8.5      162.8   8.5      162.8   9.0      162.9   8.3
165      165.1   9.0      165.3   9.0      165.2   8.7      165.3   8.7
168      168.2   9.3      168.3   9.3      168.1   9.0      168.1   9.0
170      168.9   7.6      169.0   7.7      169.2   7.2      169.1   7.4
172      172.2   7.4      172.2   7.8      172.0   7.4      172.1   7.5
175      174.9   8.4      174.9   8.5      174.9   8.4      174.7   8.3
178      177.6   8.5      177.5   8.5      177.4   8.0      177.2   8.0
180      179.7   8.7      179.6   8.6      179.4   8.2      179.2   8.1
182      181.8   8.1      182.1   8.2      181.3   7.8      181.1   7.5
185      183.9   8.9      183.9   9.1      183.3   8.2      183.2   8.1
190      190.5   9.7      191.1  10.0      189.0   9.0      189.0   8.9
Slope    0.98             0.98             0.94             0.91




TABLE XVI. Same as Table XV for the NN analysis.

Input       quad9            quad11           cub9             cub11
mass     mean   width     mean   width     mean   width     mean   width
150      149.0   9.8      150.1  10.8      150.0   8.9      150.8   9.9
155      154.6   9.6      154.6  10.0      155.1   8.6      155.5   8.2
160      159.6   9.5      159.8   9.7      159.6   9.4      160.1   8.7
162      161.8   9.2      162.1   9.0      161.9   9.1      162.3   8.3
165      163.9   9.2      164.4   9.4      163.7   9.2      164.0   8.6
168      167.2   9.7      167.6  10.0      166.9   9.8      167.0   9.8
170      168.3   8.8      168.3   8.2      168.4   8.0      168.3   8.0
172      171.6   8.8      171.5   8.3      171.7   8.4      171.7   8.3
175      174.6   9.3      174.6   9.1      174.5   9.0      174.3   9.0
178      176.6   8.7      176.6   8.8      176.6   8.6      176.6   8.4
180      179.0   9.0      178.9   8.9      178.6   8.7      179.0   8.5
182      181.1   8.9      180.9   9.0      180.8   8.4      180.9   7.8
185      183.0   8.9      182.8   9.1      182.8   8.6      182.5   8.4
190      189.0   9.1      189.0   9.8      188.4   8.5      188.2   8.1
Slope    0.98             0.96             0.95             0.93






TABLE XVII. Ensemble tests for the LB analysis with $n_s = 26$ events and
$n_b = 0$. All entries are in GeV/$c^2$.

Input       quad9            quad11           cub9             cub11
mass     mean   width     mean   width     mean   width     mean   width
168      168.3   6.7      168.2   6.7      168.4   6.3      168.2   6.5
170      168.9   5.9      168.9   6.2      169.1   5.7      168.9   5.8
172      172.2   6.2      172.2   6.0      172.1   5.9      172.1   5.9
175      175.6   6.6      175.7   6.f      175.5   6.2      175.5   6.4

TABLE XVIII. Same as Table XVII for the NN analysis.

Input       quad9            quad11           cub9             cub11
mass     mean   width     mean   width     mean   width     mean   width
168      167.7   6.3      168.1   6.8      168.0   5.8      167.9   6.4
170      168.9   6.1      169.0   6.0      169.0   5.6      168.8   5.7
172      172.0   6.1      172.3   6.2      172.0   5.5      172.0   5.9
175      175.6   6.5      175.6   6.7      175.2   6.0      175.3   6.4



TABLE XIX. Results of mass fits to ensembles of Monte Carlo events. The ensembles
consisted of 10,000 experiments of 77 events each, with the compositions indicated
below.

Analysis   Input mass     $\langle n_s \rangle$   $\langle n_b \rangle$   Mean          Width
           (GeV/$c^2$)                                                    (GeV/$c^2$)   (GeV/$c^2$)
LB         175            23.8                    53.2                    175.0         8.7
NN         172            28.8                    48.2                    171.6         8.0



accurate determination of the top quark mass in that
region, leading to a smaller $\sigma(m_t)$.

Next, we try ensembles with compositions that match the results of the likelihood
fit. The results are given in Table XIX. (These and all subsequent results use the
"quad9" prescription.) Plots of the mass distributions from these ensembles are
shown in Fig. 31. Also shown are the distributions of the pull quantity

$$
\text{pull} = \frac{m_t(\text{measured}) - m_t(\text{true})}{\sigma(m_t)}.
\tag{7.11}
$$

If the errors produced by the mass extraction procedure are correct, these
distributions should have unit width, as is indeed observed. In addition, 70% of
the $1\sigma$ error intervals from the LB ensemble include 175 GeV/$c^2$, and 69% of
those from the NN ensemble include 172 GeV/$c^2$, as expected.
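
The pull and coverage checks above amount to a few lines of bookkeeping. The sketch below assumes the ensemble results are available as parallel lists of extracted masses and their estimated errors; the function names are hypothetical.

```python
import statistics


def pulls(measured, errors, true_mass):
    """Pull of each experiment: (measured - true) / estimated error."""
    return [(m - true_mass) / e for m, e in zip(measured, errors)]


def pull_width(measured, errors, true_mass):
    """Standard deviation of the pulls; about 1 if the errors are correct."""
    return statistics.pstdev(pulls(measured, errors, true_mass))


def coverage(measured, errors, true_mass):
    """Fraction of experiments whose 1-sigma interval contains the true mass."""
    hits = sum(1 for m, e in zip(measured, errors) if abs(m - true_mass) <= e)
    return hits / len(measured)
```

For correctly estimated Gaussian errors the coverage should come out near 68%, which is the benchmark the quoted 70% and 69% are compared against.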

The minimum $-\ln L$ value for the LB fit was 23.1; for the NN fit, it was 74.5.
(A smaller value of $-\ln L$ corresponds to a better fit to the expected
distributions.) This quantity is plotted for the LB and NN ensembles in Fig. 32.
A $-\ln L$ value larger than that of the data is seen in about 7% of LB experiments
and in about 28% of NN experiments.

One can also look at the distribution of statistical errors from ensemble tests.
For the data, the statistical error is 5.6 GeV/$c^2$ for the LB analysis, and
6.0 GeV/$c^2$ for the NN analysis. Plots of the statistical error for the ensemble
fits are shown in Fig. 33. An error smaller than that for the data is seen in about
6% of LB experiments and in about 25% of NN experiments. The correlation between
the mass and the error for the LB ensemble is exhibited in Fig. 34. This shows that
experiments with a small error typically yield masses closer to the true value.

It is interesting to examine the ensemble results for 
that subset of experiments where the extracted statisti- 
cal error is similar to that actually obtained. We define 

FIG. 31. Mass and pull distributions for 10,000 MC experiment ensembles with
compositions matching the fit results. The dashed curves are Gaussian fits. For the
mass distributions, the width is the symmetric interval containing 68% of the
entries; for the pull distributions, it is from the Gaussian fit.

FIG. 32. Minimum $-\ln L$ distributions from the LB and NN ensembles. The arrows
show the values corresponding to the data fits.

FIG. 35. Relative error ($\sigma(m_t)/m_t$) distributions from the LB and NN
ensembles. The arrows show the value corresponding to the data fits, and the
hatched regions show the definitions of the accurate subsets.

FIG. 34. Scatter plot of masses and statistical errors from the LB ensemble. The
dashed lines of constant relative error delimit the "accurate subset" (see text).



this "accurate subset" as follows. First, find the relative 
error (a(mt)/mt) for the result. For LB, this is 0.0322; 
for NN, it is 0.0350. Then convert these numbers to a 
percentile in the relative error distribution. These are 
6.0% and 24.9% for LB and NN, respectively. For any 
ensemble, we then define the accurate subset by look- 
ing at its relative error distribution and selecting those 
experiments which lie within a range of ±5% around the 
above percentiles. This is illustrated in Figs. [54] [35|. This 
procedure thus selects 10% of the total sample. (The rel- 
ative error is used because the statistical error tends to 
increase slightly with increasing mass; therefore, cutting 
on relative rather than absolute error results in a less 
biased subsample.) 
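The accurate-subset selection described above can be sketched as follows. This is a minimal illustration, not the analysis code: experiments are assumed to be stored as hypothetical `(mass, stat_error)` pairs, the percentile is passed in directly (6.0% for LB in the text), and the toy ensemble is invented for demonstration.

```python
import random


def accurate_subset(experiments, percentile, half_width=0.05):
    """Select experiments whose relative error lies within +/- half_width
    (in percentile units) of the given percentile of the ensemble's
    relative-error distribution.

    experiments: list of (mass, stat_error) pairs.
    """
    # Rank experiments by relative error sigma(m_t)/m_t.
    ranked = sorted(experiments, key=lambda e: e[1] / e[0])
    n = len(ranked)
    lo = max(0, round((percentile - half_width) * n))
    hi = min(n, round((percentile + half_width) * n))
    return ranked[lo:hi]


# Hypothetical toy ensemble: masses and statistical errors are invented.
rng = random.Random(0)
toy = [(rng.gauss(175.0, 8.0), rng.uniform(4.0, 12.0)) for _ in range(1000)]

lb_subset = accurate_subset(toy, percentile=0.060)
print(len(lb_subset), "of", len(toy), "experiments selected")
```

With a ±5% window the selection keeps 10% of the ensemble, as stated in the text, provided the window does not run off either end of the distribution.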

There is an additional complication which arises when
a cut is made on the statistical error. The spacing of the
generated mass points is finer around $m_t$ = 175 GeV/c².
This permits a more accurate determination of the top
quark mass in that range. However, this implies that
if a small error is required, the masses of the selected
events will be biased towards the region with finer spacing.
(Note, however, that as long as a cut on the error
is not made, the uneven MC spacing does not bias the
mass. Studies of an even but coarser MC spacing show
that adding extra points reduces the statistical error in
the region where the extra points are added, but does
not, on average, shift the extracted mass distribution.)
Thus, for the accurate subset fits we changed the procedure
slightly, adding Monte Carlo points at intervals
of 2.5 GeV/c² between 130 and 160 GeV/c² and also
between 185 and 210 GeV/c². These additional mass
points were constructed by interpolating between the
existing MC histograms on either side. The results of these
fits with the accurate subset cuts are shown in Fig. 36.
The widths are 4.6 GeV/c² and 6.0 GeV/c² for LB and
NN, respectively. This is a further indication that the
error estimates from the likelihood fit are reliable.

FIG. 36. Mass distributions for accurate subsets of ensembles.
The dashed curves are Gaussian fits.

The results of the LB and NN analyses can be compared
experiment-by-experiment, provided that the ensemble
definitions are the same. We use the same ensemble
definition as for the first set of tests (N = 78
events and a 1:2 signal/background ratio) with $m_t$ =
175 GeV/c². The results for 10,000 experiments are given
in Table XX. It is seen that given the observed statistical
errors, a difference between the two analyses of the
magnitude seen is expected about 20% of the time.

TABLE XX. Comparisons of LB and NN ensembles for
$m_t$ = 175 GeV/c² and a 1:2 signal/background ratio. The
first line is the mean difference between the results; the second
and third lines give the fraction of experiments for which
the difference exceeds the observed difference of 2.7 GeV/c².
(Numbers are in GeV/c².)

                      Full           LB             NN
                      ensemble       acc. subset    acc. subset
  <LB - NN>           0.78 ± 0.05    0.34 ± 0.06    0.51 ± 0.09
  (LB - NN) > 2.7     29%            11%            18%
  |LB - NN| > 2.7     45%            16%            28%

It is also interesting to look at the correlation between
the LB and NN measurements. This can be defined using
the ensemble mass distributions of $m_{\rm LB}$ and $m_{\rm NN}$ as

$$\rho = \frac{\langle (m_{\rm LB} - \langle m_{\rm LB} \rangle)(m_{\rm NN} - \langle m_{\rm NN} \rangle) \rangle}{\sigma_{\rm LB}\,\sigma_{\rm NN}}. \qquad (7.12)$$

This is appropriate for Gaussian distributions; however,
our distributions typically have a small number of
non-Gaussian outliers. To explore the sensitivity of this
quantity to these outliers, the following procedure is used.

• For the cuts of interest, plot $m_{\rm LB}$ and $m_{\rm NN}$. Record
the means and RMS widths of these distributions
($\langle m_{\rm LB} \rangle$, $\sigma_{\rm LB}$, $\langle m_{\rm NN} \rangle$, $\sigma_{\rm NN}$).

• Reject experiments which are more than $K\sigma$ from
the mean. Specifically, make the additional cut that

$$|m_{\rm LB} - \langle m_{\rm LB} \rangle| < K\sigma_{\rm LB} \quad \text{and} \quad |m_{\rm NN} - \langle m_{\rm NN} \rangle| < K\sigma_{\rm NN}. \qquad (7.13)$$

• Replot $m_{\rm LB}$ and $m_{\rm NN}$ with this additional cut, and
record the new means and RMS widths ($\langle m_{\rm LB} \rangle'$,
$\sigma'_{\rm LB}$, $\langle m_{\rm NN} \rangle'$, $\sigma'_{\rm NN}$).

• Plot (with all cuts) the distribution of

$$(m_{\rm LB} - \langle m_{\rm LB} \rangle') \cdot (m_{\rm NN} - \langle m_{\rm NN} \rangle'). \qquad (7.14)$$

• Find the mean of this distribution. $\rho$ is then calculated
by dividing this mean by $\sigma'_{\rm LB}\,\sigma'_{\rm NN}$.

TABLE XXI. Values of correlation parameter $\rho$.

  K       Full       LB             NN
          sample     acc. subset    acc. subset
  100     0.62       0.89           0.77
  5       0.65       0.89           0.88
  4       0.67       0.89           0.89
  3       0.70       0.89           0.89
  2       0.77       0.87           0.88
  1       0.75       0.67           0.78

The results are tabulated for the full sample and for
the LB and NN accurate subsets in Table XXI. This is
done using the same $m_t$ = 175 GeV/c² ensembles as for
the previous comparisons. They do not depend strongly
on K within reasonable ranges. To get a single number,
we average the K = 5 results for the two accurate subset
results, giving 0.88. This appears to be a reasonable
representation of the accurate subset numbers (within a
few percent) for K > 2. Propagating statistical errors
through this calculation gives $\rho$ = 0.88 ± 0.04.
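The outlier-trimmed correlation procedure can be sketched as follows. This is an illustration under assumptions, not the analysis code: the paired toy masses are drawn from a correlated bivariate Gaussian with invented widths, and the function name `trimmed_correlation` is hypothetical.

```python
import math
import random
import statistics


def trimmed_correlation(m_lb, m_nn, k):
    """Correlation of paired masses after rejecting experiments lying more
    than k RMS widths from either mean (cut of Eq. 7.13)."""
    mean_lb, mean_nn = statistics.fmean(m_lb), statistics.fmean(m_nn)
    rms_lb, rms_nn = statistics.pstdev(m_lb), statistics.pstdev(m_nn)
    kept = [(a, b) for a, b in zip(m_lb, m_nn)
            if abs(a - mean_lb) < k * rms_lb and abs(b - mean_nn) < k * rms_nn]
    kept_lb = [a for a, _ in kept]
    kept_nn = [b for _, b in kept]
    # Recompute means and widths after the cut (the primed quantities).
    mean_lb2, mean_nn2 = statistics.fmean(kept_lb), statistics.fmean(kept_nn)
    rms_lb2, rms_nn2 = statistics.pstdev(kept_lb), statistics.pstdev(kept_nn)
    # Mean of the product distribution of Eq. (7.14), normalized by the widths.
    products = [(a - mean_lb2) * (b - mean_nn2) for a, b in kept]
    return statistics.fmean(products) / (rms_lb2 * rms_nn2)


# Toy ensemble with a known correlation (all numbers are illustrative).
rng = random.Random(7)
true_rho = 0.88
m_lb, m_nn = [], []
for _ in range(10_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    m_lb.append(175.0 + 8.0 * z1)
    m_nn.append(175.0 + 9.0 * (true_rho * z1 + math.sqrt(1 - true_rho ** 2) * z2))

rho_k5 = trimmed_correlation(m_lb, m_nn, k=5.0)
print(f"rho (K = 5) = {rho_k5:.3f}")
```

For purely Gaussian toys the result barely depends on K; a strong K dependence, as in the full-sample column of Table XXI, signals non-Gaussian outliers.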

In summary, these ensemble tests show that the masses 
and errors obtained from the likelihood fit are reliable, 
and that our observed data set is not particularly un- 
likely. 



G. Systematic errors 

1. Energy scale errors 

The first major component of the systematic error is
the jet energy scale uncertainty. What is relevant here
is the uncertainty in the relative scale between the data
and MC, rather than in the absolute scale. This was
estimated to be ±(2.5% + 0.5 GeV) for each jet (see Sec. IV).

We propagate this per-jet error to the final mass measurement
by performing ensemble tests with all the jets
in the events comprising the ensemble scaled up or down
by the per-jet uncertainty. For these tests, we used large
experiment sizes, with N = 1000. The results are given
in Table XXII and give an error of about ±4 GeV/c².
Comparing this with the shifts in the $m_{\rm fit}$ distributions
seen after scaling the jets (Fig. ^), we estimate the ratio
between a shift in the final extracted mass and a shift in
$m_{\rm fit}$ to be about 1.1.

TABLE XXII. Ensemble means for determining error due
to jet energy scale. Each experiment consisted of N = 1000
events; the signal/background ratios are the same as in
Table XIX.

                     LB              NN
  Input mass         175.0 GeV/c²    172.0 GeV/c²
  Input <n_s>        309.1 events    374.0 events
  -2.5% - 0.5        170.9 GeV/c²    167.6 GeV/c²
  Nominal            175.4 GeV/c²    171.3 GeV/c²
  +2.5% + 0.5        179.4 GeV/c²    175.2 GeV/c²
  Symmetric error    4.2 GeV/c²      3.8 GeV/c²

TABLE XXIII. Ensemble means for determining the difference
between isajet and herwig. (All numbers in GeV/c².)
Each ensemble consisted of N = 1000 event experiments with
a 1:2 signal/background ratio.

            LB                           NN
  m_t       HERWIG   ISAJET   Diff       HERWIG   ISAJET   Diff
  150       150.5    151.7    -1.2       149.4    150.4    -1.0
  160       161.0    160.9     0.1       159.8    159.4     0.4
  170       169.3    170.8    -1.5       168.3    169.0    -0.7
  180       180.1    180.1     0.0       179.6    178.9     0.7
  190       190.2    190.1     0.1       189.0    188.8     0.2
  200       201.9    200.9     1.0       200.5    197.6     2.9
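As a quick arithmetic check of Table XXII, and assuming (the text does not spell this out) that the "symmetric error" is half the spread between the ensemble means with jets scaled up and scaled down, the tabulated means give:

```latex
\begin{align*}
\sigma_{\rm LB} &\approx \tfrac{1}{2}\,(179.4 - 170.9)~\mathrm{GeV}/c^2 = 4.25~\mathrm{GeV}/c^2, \\
\sigma_{\rm NN} &\approx \tfrac{1}{2}\,(175.2 - 167.6)~\mathrm{GeV}/c^2 = 3.8~\mathrm{GeV}/c^2 .
\end{align*}
```

These agree with the tabulated 4.2 and 3.8 GeV/c² to within the rounding of the input means.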

The systematic uncertainty in the electromagnetic en- 
ergy scale is much smaller than that of the jets, and can 
be neglected. The systematic uncertainty of the muon 
momentum measurement is estimated to be 2.5%. The 
effect of this uncertainty is found to be negligible relative 
to the jet scale uncertainty. 



2. Generator dependencies 

The next component of the systematic error is that
due to uncertainties in how well the underlying Monte
Carlo event generators model reality. We separate this
into signal and background components. Of particular
concern is the modeling of QCD radiation by the $t\bar{t}$ signal
Monte Carlo.

To estimate the error due to the herwig generator,
we characterize herwig events using variables which are
sensitive to the amount of initial and final state radiation
(ISR and FSR) in each event. To do this, we
match the direction of reconstructed jets with herwig
partons and use the Monte Carlo parentage information
to identify the jets which come from the b quarks and the
hadronically-decaying W boson. We consider the four
jets with highest $E_T$, $j_1, \ldots, j_4$, and define the variables:

• x = Number of jets in $j_1, \ldots, j_4$ which do not come
from a b quark or the W boson (i.e., jets which are
likely to be due to ISR).

• y = $N_j$ - 4 = Number of extra jets of any kind in
the event ($N_j$ = number of jets with $E_T$ > 15 GeV
and $|\eta|$ < 2.0).

• z = Number of non-ISR jets in $j_1, \ldots, j_4$ which have
the same parent as a higher $E_T$ jet (i.e., the number
of extra jets due to FSR among the top four).



We take a herwig Monte Carlo sample (with $m_t$ =
170 GeV/c²) and bin it using these variables into a
three-dimensional histogram with ranges 0 ≤ x, y, z ≤ 2
(27 bins). For each bin (x, y, z), we plot the fitted masses
for all events in that bin, fit them to a Gaussian to form
$\langle m_{\rm fit} \rangle(x, y, z)$, and then fit the resulting values to the
empirical function

$$G(x, y, z) = m_0 + u x + v \max(0, y - z) + w z, \qquad (7.15)$$

for fit parameters $m_0$, u, v, and w. Here, u describes
the dependence of $\langle m_{\rm fit} \rangle$ on ISR and v and w describe its
dependence on FSR. In particular, the v term describes
the dependence of the mass on the number of extra jets
which cannot be attributed to either an ISR or FSR jet
displacing another jet out of the top four. Additional
low $E_T$ jets affect the mass only if they are FSR; thus
we group v with w. We compute a population-weighted
average of G over all bins; this is seen to agree well with
$\langle m_{\rm fit} \rangle$ from the entire sample. Finally, we recalculate
this average with (a) u (ISR) increased by 50% and (b)
v and w (FSR) increased together by 50%. This gives
excursions of 0.69 and 1.74 GeV/c², respectively. Adding
these in quadrature yields an error of 1.9 GeV/c². (Monte
Carlo studies of ensembles constructed of events from
individual (x, y, z) bins confirm that, for these variations,
the mass resulting from the likelihood fit approximately
tracks $\langle m_{\rm fit} \rangle$.)
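The variation procedure built on Eq. (7.15) can be sketched as follows. The fit parameters and bin populations below are hypothetical placeholders chosen only to make the example run, since the fitted values are not quoted in the text. Only the final quadrature sum uses the excursions actually quoted above.

```python
import math


def g(x, y, z, m0, u, v, w):
    """Empirical function of Eq. (7.15)."""
    return m0 + u * x + v * max(0, y - z) + w * z


def weighted_mean_g(populations, m0, u, v, w):
    """Population-weighted average of G over the (x, y, z) bins."""
    total = sum(populations.values())
    return sum(n * g(x, y, z, m0, u, v, w)
               for (x, y, z), n in populations.items()) / total


# Hypothetical bin populations and fit parameters (NOT from the paper).
populations = {(x, y, z): 100 // (1 + x + y + z)
               for x in range(3) for y in range(3) for z in range(3)}
m0, u, v, w = 172.0, 4.0, -3.0, -6.0

nominal = weighted_mean_g(populations, m0, u, v, w)
isr_shift = weighted_mean_g(populations, m0, 1.5 * u, v, w) - nominal
fsr_shift = weighted_mean_g(populations, m0, u, 1.5 * v, 1.5 * w) - nominal

# Quadrature sum of the excursions quoted in the text (GeV/c^2).
quoted_total = math.hypot(0.69, 1.74)
print(f"toy shifts: ISR {isr_shift:+.2f}, FSR {fsr_shift:+.2f}")
print(f"quoted excursions in quadrature: {quoted_total:.2f} GeV/c^2")
```

The quadrature sum of 0.69 and 1.74 GeV/c² is about 1.87 GeV/c², which rounds to the 1.9 GeV/c² quoted in the text.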

We have performed several additional cross checks to
verify that this is a reasonable estimate of the signal
generator error. The first is simply to compare these results
to those from a different event generator, in this case
isajet. We constructed ensembles from isajet events
and analyzed them using the MC histograms derived
from herwig. These are compared to ensembles of herwig
events in Table XXIII. Taking the six differences in
the region 160-180 GeV/c² gives a mean of -0.17 GeV/c²
and an RMS of 0.8 GeV/c².
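The quoted mean and RMS can be cross-checked from the six herwig minus isajet differences in Table XXIII for input masses of 160, 170, and 180 GeV/c². The text does not say which RMS convention was used, so the sketch below evaluates two common ones; treating either as the authors' choice is an assumption.

```python
import math
import statistics

# Differences (GeV/c^2) from Table XXIII, m_t = 160, 170, 180: LB then NN.
diffs = [0.1, -1.5, 0.0, 0.4, -0.7, 0.7]

mean_diff = statistics.fmean(diffs)
rms_about_zero = math.sqrt(statistics.fmean([d * d for d in diffs]))
sample_std = statistics.stdev(diffs)  # spread about the mean, n - 1 denominator

print(f"mean = {mean_diff:.2f}")
print(f"RMS about zero = {rms_about_zero:.2f}, sample std = {sample_std:.2f}")
```

Both conventions round to the quoted 0.8 GeV/c², and the mean reproduces the quoted -0.17 GeV/c².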

We also vary the QCD coupling strength parameter,
$\Lambda_{\rm QCD}$, of the herwig $t\bar{t}$ Monte Carlo. The default value
of this parameter in herwig 5.7 is 0.18 GeV; the current
experimental value from the Particle Data Group is
$0.21^{+0.04}_{-0.03}$ GeV [||. Accordingly, we generate additional $t\bar{t}$
Monte Carlo with $\Lambda_{\rm QCD}$ set to 0.15, 0.21, and 0.25 GeV,
with $m_t$ = 170 and 175 GeV/c². We then construct
ensembles from these samples and process them using the
standard analysis. The results are given in Table XXIV.
The size of the resulting deviations is on the order of
1 GeV/c²; they appear to be dominated by Monte Carlo
statistics.

We can make another comparison by using a version of
herwig 5.8 in which final state radiation (FSR) in top
quark decays is substantially suppressed. We compare
results from ensembles made from this version to those
from herwig 5.8 with normal radiation. The results are
shown in Table XXV. Averaging over LB and NN, this
is seen to give an excursion of about 2.15 GeV/c². Note
that the $m_{\rm fit}$ distribution with FSR suppressed is
significantly narrower on the low mass side than distributions
with normal radiation. This difference in shape is why
the relation between means of $m_{\rm fit}$ and ensemble results
is different here than described above.

The results of these cross checks confirm that our esti- 
mate for the systematic error due to the signal generator 
of 1.9 GeV/c 2 is reasonable. 

We also study the effects of varying the VECBOS background
model. Besides the sample used for the mass measurement
(which uses a $Q^2$ scale of $\langle p_T^{\rm jet} \rangle^2$ and herwig
fragmentation), we have samples with a $Q^2$ scale of $M_W^2$
and with isajet fragmentation. Results from ensembles
made from these samples are shown in Table XXVI. (The
ensemble compositions were the same as for the jet energy
scale tests.) The largest difference seen is about
2.5 GeV/c² using the $M_W^2$ scale with herwig fragmentation.

A concern is that the systematic error assigned here
to VECBOS may not adequately reflect the level of agreement
between VECBOS and data for $\eta^W$ in the forward
region (Fig. ||). To check this, we reweight the VECBOS
events using a smooth function of $\eta^W$ (a Gaussian) chosen
to optimize the agreement between the simulation
and the data. When we redo the mass extraction with
this reweighted background, the top quark mass shifts
by only 0.4-0.5 GeV/c², a value much smaller than the
error we attribute to VECBOS. This error can therefore
be neglected.

We also do the fits with the fraction of QCD multijets 
contributing to the background histogram [(22±5)%] var- 
ied within its errors. The changes to the final extracted 
mass are < 0.2 GeV/c 2 , well below the assigned error. 



3. Noise and multiple interactions 

At the luminosities at which most of our data were
collected, it is likely that during a single beam crossing,
there will be multiple $p\bar{p}$ inelastic interactions (MI).
(This is expected about 2/3 of the time.) While these
extra interactions rarely give rise to additional high-$p_T$
objects, they do deposit a small amount of additional energy
over the entire calorimeter, affecting the jet energy
calibration. Additional noise in the calorimeter is produced
by the radioactive decay of the uranium absorber.
The Monte Carlo samples used for this analysis do not include
these effects. To estimate them, we generate a small
number of additional Monte Carlo events which include
noise, and which are overlaid with one or two additional
interactions. The means of the $m_{\rm fit}$ distribution for these
samples are given in Table XXVII. Based on the luminosity
profile of the collected data, we estimate that in order
to represent the data, these samples should be combined
in the ratio 0.31 : 0.33 : 0.36. The weighted average of
the three means is then 170.5 ± 0.6 GeV/c²; the shift from
the zero additional interaction case is 1.2 ± 0.7 GeV/c².
Scaling this by the factor 1.1 for the ratio between a shift
in final extracted mass and a shift in $m_{\rm fit}$ (Sec. VII G 1)
gives an estimated shift due to noise and multiple interactions
of 1.3 ± 0.8 GeV/c². Since this effect is relatively
poorly known and is small compared to other error
sources, we do not attempt to correct the result for this
effect, but instead include it as a systematic error.
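The arithmetic behind these numbers can be reproduced from the means and weights in Table XXVII. The sketch below is a cross-check under assumptions: the errors on the three means are propagated as independent, and the error on the shift is formed by adding the error of the weighted average and that of the zero-interaction mean in quadrature. The text does not state how the quoted errors were obtained.

```python
import math

# Table XXVII: <m_fit> means, their errors (GeV/c^2), and luminosity weights
# for 0, 1, and 2 additional interactions.
means = [169.3, 170.5, 171.6]
errors = [0.4, 1.3, 1.2]
weights = [0.31, 0.33, 0.36]

average = sum(w * m for w, m in zip(weights, means))
average_err = math.sqrt(sum((w * e) ** 2 for w, e in zip(weights, errors)))

shift = average - means[0]
# Assumption: quadrature sum, ignoring the correlation between the two terms.
shift_err = math.hypot(average_err, errors[0])

scale = 1.1  # ratio of shift in extracted mass to shift in m_fit
mass_shift, mass_shift_err = scale * shift, scale * shift_err

print(f"weighted average = {average:.1f} +/- {average_err:.1f}")
print(f"shift            = {shift:.1f} +/- {shift_err:.1f}")
print(f"scaled shift     = {mass_shift:.1f} +/- {mass_shift_err:.1f}")
```

Under these assumptions the sketch reproduces the quoted 170.5 ± 0.6, 1.2 ± 0.7, and 1.3 ± 0.8 GeV/c².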



4. Monte Carlo statistics

We assess the effect of Monte Carlo statistics on the
final result by performing the fit to the data many times,
each time smearing the MC histograms used to calculate
the likelihood according to Poisson statistics. This is
done separately for signal and background. The 68%
widths of the resulting mass distributions are given in
Table XXVIII.
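The smearing step can be sketched as follows. This is a minimal illustration, assuming each histogram bin is replaced by a Poisson draw with mean equal to its content. The histogram contents and helper names are hypothetical, and the refit to the data that follows each smearing is not shown. The last two lines combine the signal and background widths from Table XXVIII in quadrature.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with mean lam (Knuth's multiplication method;
    adequate for the modest bin contents assumed here)."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def smear_histogram(bins, rng):
    """Return a copy of the histogram with each bin Poisson-fluctuated."""
    return [poisson(rng, n) for n in bins]


rng = random.Random(3)
mc_histogram = [0, 5, 40, 12]          # hypothetical MC bin contents
smeared = smear_histogram(mc_histogram, rng)

# Quadrature sums of the signal and background entries in Table XXVIII.
total_lb = math.hypot(0.49, 0.33)
total_nn = math.hypot(0.99, 0.57)
print(smeared, f"LB total = {total_lb:.1f}, NN total = {total_nn:.1f}")
```

The quadrature sums round to the 0.6 and 1.1 GeV/c² totals listed in Table XXVIII.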



5. Systematic error summary 



Table XXIX gives a summary of the systematic errors.
In addition to the errors already discussed, the mean
difference of 0.8 GeV/c² between the LB and NN ensemble
results from Table XX has been added as a systematic
uncertainty, and an additional error of 1 GeV/c² has been
added to cover possible small biases in the likelihood
fitting method (this is approximately the RMS spread of the
different polynomial fits in Table XIII). Note that these
two components are of the same order as the estimated
error due to Monte Carlo statistics, and that these small
biases are probably due in large part to statistical
fluctuations in the Monte Carlo histograms. Nevertheless,
we retain these as separate components of the systematic
error in lieu of exploring this further with still larger
Monte Carlo samples.

The total systematic errors here are slightly smaller
than those reported in Ref. ||. There, the signal generator
error was 3.3 GeV/c², taken from the difference between
herwig and an older version of isajet, and the LB/NN
difference was 1.35 GeV/c², taken from half the difference
of the fit results.

TABLE XXIV. Ensemble tests with $\Lambda_{\rm QCD}$ varied. Ensembles
consisted of experiments with N = 1000 events and a 1:2
signal/background ratio.

  Λ_QCD    <m_fit> (GeV/c²)         LB (GeV/c²)              NN (GeV/c²)
  (GeV)    m_t = 170   m_t = 175    m_t = 170   m_t = 175    m_t = 170   m_t = 175
  0.15     171.0       173.5        170.5       175.2        169.5       174.8
  0.18     168.8       173.1        169.2       175.3        168.3       174.5
  0.21     170.8       173.6        170.2       174.5        169.5       173.3
  0.25     168.7       173.2        168.3       175.7        167.2       175.0



TABLE XXV. Comparison of ensembles constructed using
herwig 5.8 both with and without FSR suppressed. The
ensembles consist of N = 77 event experiments. For the LB
case, $\langle n_s \rangle$ = 23.8, and for NN, $\langle n_s \rangle$ = 28.8. For both cases,
$m_t$ = 170 GeV/c².

                     <m_fit>     LB          NN
                     (GeV/c²)    (GeV/c²)    (GeV/c²)
  FSR suppressed     176.0       172.2       172.7
  Normal FSR         170.1       170.7       169.9
  Difference         5.9         1.5         2.8



TABLE XXIX. Systematic error summary.

                         LB          NN          Average
                         (GeV/c²)    (GeV/c²)    (GeV/c²)
  Jet energy scale       4.2         3.8         4.0
  Generator
    $t\bar{t}$ signal          1.9         1.9         1.9
    VECBOS flavors       2.5         2.5         2.5
  Noise/MI               1.3         1.3         1.3
  Monte Carlo stat.      0.6         1.1         0.85
  LB/NN diff             0.8         0.8         0.8
  Likelihood fit         1.0         1.0         1.0
  Total                  5.6         5.4         5.5
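The totals in Table XXIX are consistent with adding the individual components in quadrature, which the short check below illustrates. Treating the components as uncorrelated is the assumption implied by a quadrature sum; the dictionary layout is just a convenient way to hold the table entries.

```python
import math

# Components from Table XXIX in GeV/c^2: (LB, NN, Average).
components = {
    "Jet energy scale":  (4.2, 3.8, 4.0),
    "ttbar signal":      (1.9, 1.9, 1.9),
    "VECBOS flavors":    (2.5, 2.5, 2.5),
    "Noise/MI":          (1.3, 1.3, 1.3),
    "Monte Carlo stat.": (0.6, 1.1, 0.85),
    "LB/NN diff":        (0.8, 0.8, 0.8),
    "Likelihood fit":    (1.0, 1.0, 1.0),
}


def quadrature_total(column):
    """Quadrature sum of one column of the table."""
    return math.sqrt(sum(row[column] ** 2 for row in components.values()))


total_lb, total_nn, total_avg = (quadrature_total(i) for i in range(3))
print(f"LB {total_lb:.1f}, NN {total_nn:.1f}, average {total_avg:.1f} GeV/c^2")
```

The sums round to 5.6, 5.4, and 5.5 GeV/c², matching the "Total" row.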



TABLE XXVI. Ensemble means for determining VECBOS
differences. Samples were generated with a VECBOS $Q^2$ scale
of both $M_W^2$ and $\langle p_T^{\rm jet} \rangle^2$, and using both herwig (HW) and
isajet (IS) for fragmentation. Each experiment consisted of
N = 1000 events; the signal/background ratios are the same
as in Table XIX.

                        LB              NN
  Input mass            175.0 GeV/c²    172.0 GeV/c²
  Input <n_s>           309.1 events    374.0 events
  <p_T^jet>², HW        175.4 GeV/c²    171.3 GeV/c²
  M_W², HW              177.9 GeV/c²    173.8 GeV/c²
  <p_T^jet>², IS        175.0 GeV/c²    171.2 GeV/c²
  M_W², IS              175.8 GeV/c²    171.6 GeV/c²
  Max. difference       2.5 GeV/c²      2.5 GeV/c²

TABLE XXVII. Means of $m_{\rm fit}$ distributions of $t\bar{t}$ Monte
Carlo for multiple interaction error determination. (For the e
+ jets channel, $m_t$ = 170 GeV/c².)

                                <m_fit> (GeV/c²)    Weight
  0 additional interactions     169.3 ± 0.4         0.31
  1 additional interaction      170.5 ± 1.3         0.33
  2 additional interactions     171.6 ± 1.2         0.36


TABLE XXVIII. Errors due to Monte Carlo statistics.

                 LB          NN
                 (GeV/c²)    (GeV/c²)
  Signal         0.49        0.99
  Background     0.33        0.57
  Total          0.6         1.1



H. Summary 

For the final mass result, we combine the results of
these two analyses, taking into account their correlation $\rho$
determined earlier. Let $m_{\rm LB}$ and $m_{\rm NN}$ be the two results
and $\sigma_{\rm LB}$ and $\sigma_{\rm NN}$ be their errors. Then we form a $\chi^2$ as
a function of the combined mass M:

$$\chi^2(M) = \frac{1}{(1 - \rho^2)\,\sigma_{\rm LB}^2\,\sigma_{\rm NN}^2} \left[ \sigma_{\rm NN}^2 (M - m_{\rm LB})^2 - 2\rho\,\sigma_{\rm LB}\sigma_{\rm NN} (M - m_{\rm LB})(M - m_{\rm NN}) + \sigma_{\rm LB}^2 (M - m_{\rm NN})^2 \right]. \qquad (7.16)$$

The combined result and its error is then defined by the
minimum of this curve and the points where the curve
rises by one unit from the minimum. (Monte Carlo studies
of this combination give a width of the pull distribution
of 1.11 for the full sample, but 0.76 for the LB
accurate subset and 0.97 for the NN accurate subset.)

Inserting $m_{\rm LB}$ = 174.0 GeV/c², $\sigma_{\rm LB}$ = 5.6 GeV/c²,
$m_{\rm NN}$ = 171.3 GeV/c², $\sigma_{\rm NN}$ = 6.0 GeV/c², and $\rho$ = 0.88
(for the accurate subsets) gives

$$M = 173.3 \pm 5.6~\mathrm{GeV}/c^2. \qquad (7.17)$$

The systematic errors of the two methods are averaged,
giving a final result of

$$m_t = 173.3 \pm 5.6\,(\mathrm{stat}) \pm 5.5\,(\mathrm{syst})~\mathrm{GeV}/c^2. \qquad (7.18)$$
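The combination can be sketched numerically. The code below minimizes the standard $\chi^2$ for two correlated Gaussian measurements in closed form and checks that the curve rises by one unit at one standard deviation from the minimum. The function names are hypothetical, and the inputs are the rounded values quoted above; with those rounded inputs the central value comes out near 173.4 GeV/c², close to the published 173.3 GeV/c², and the small difference is presumably due to rounding of the inputs (an assumption).

```python
import math


def chi2(m, m1, s1, m2, s2, rho):
    """Chi-square for two correlated Gaussian measurements of one quantity."""
    prefactor = 1.0 / ((1.0 - rho ** 2) * s1 ** 2 * s2 ** 2)
    return prefactor * (s2 ** 2 * (m - m1) ** 2
                        - 2.0 * rho * s1 * s2 * (m - m1) * (m - m2)
                        + s1 ** 2 * (m - m2) ** 2)


def combine(m1, s1, m2, s2, rho):
    """Closed-form minimum of chi2 and the half-width where it rises by one."""
    cov = rho * s1 * s2
    denom = s1 ** 2 + s2 ** 2 - 2.0 * cov
    best = (m1 * (s2 ** 2 - cov) + m2 * (s1 ** 2 - cov)) / denom
    sigma = math.sqrt(s1 ** 2 * s2 ** 2 * (1.0 - rho ** 2) / denom)
    return best, sigma


m_lb, s_lb = 174.0, 5.6   # LB result and statistical error (GeV/c^2)
m_nn, s_nn = 171.3, 6.0   # NN result and statistical error (GeV/c^2)
rho = 0.88

best, sigma = combine(m_lb, s_lb, m_nn, s_nn, rho)
rise = chi2(best + sigma, m_lb, s_lb, m_nn, s_nn, rho) - chi2(best, m_lb, s_lb, m_nn, s_nn, rho)
print(f"M = {best:.1f} +/- {sigma:.1f} GeV/c^2 (chi2 rise at 1 sigma: {rise:.3f})")
```

Because the two measurements are highly correlated, the combined error is barely smaller than that of the more precise input.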



VIII. PSEUDOLIKELIHOOD ANALYSIS

A. Introduction



The pseudolikelihood (PL) analysis is an alternate
method of extracting the top quark mass, with several
important differences from the analyses of the previous
section. It thus serves as a nearly independent check of
the previous result. In this analysis, we kinematically
fit candidate events at a series of fixed top quark masses
$m_{\rm fit}$ (3C fits) over the range 100-250 GeV/c². These
fits are done using a different kinematic fitting program
(SQUAW) than was used in the previous section. In
addition, when looping over jet permutations, we allow
the assignment of jets beyond the fourth (in which case
at least one of the top four jets is treated as ISR). At each
$m_{\rm fit}$, we choose the jet permutation yielding the smallest
$\chi^2$, and interpret the resulting plot of $\chi^2/2$ versus $m_{\rm fit}$
as defining a top quark mass "pseudolikelihood" L for a
particular event given by

$$L(m_{\rm fit}) = e^{-\chi^2(m_{\rm fit})/2}. \qquad (8.1)$$

We then sum this plot over all candidate events, subtract
the expected background contribution, and fit the
remainder to a quadratic function to extract the top
quark mass. This analysis is performed mainly for
signal-enriched subsamples of the entire precut sample.

A major motivation for this analysis method is to more
fully take into account the information from different jet
permutations. For example, the fixed-mass $\chi^2$ plot for
one top quark candidate is shown in Fig. 37. The information
about both minima in this figure is incorporated
directly into the PL analysis, but is not used in the LB
and NN likelihood analyses.



B. PL method 

Some examples of $\chi^2/2$ plots for $t\bar{t}$ events are shown in
Fig. 38. These are "average $\chi^2/2$" plots: for each $m_{\rm fit}$, we
average the $\chi^2/2$ over all events in the sample. The figure
shows plots for events generated with both herwig and
isajet for top quark masses from 160 to 190 GeV/c².
The plots from isajet are slightly wider than those from
herwig. We will also need the background shape to
subtract the expected background contribution from the
data sample. It is determined by combining the average
$\chi^2/2$ plot of the VECBOS W+jets sample with that of the
QCD multijet sample. These plots are shown in Fig. 39.
They are broader and have minima at about 150 GeV/c²,
lower than those for $t\bar{t}$ events (for $m_t$ > 160 GeV/c²).
The VECBOS sample uses the average jet transverse momentum
$Q^2$ scale and herwig for fragmentation, as in
the variable-mass analyses.

FIG. 37. $\chi^2$ plot for squaw fixed-mass fits for event 58203,
4980.

FIG. 38. Average $\chi^2/2$ plots (after LB selection) for herwig
(filled circles) and isajet (open triangles) $t\bar{t}$ events.

FIG. 39. Average $\chi^2/2$ plots (after LB selection) for (a)
VECBOS W+jets and (b) QCD multijet background samples.

The next step is to determine the background normalization.
The nominal background fraction in the precut
event sample is found from the cross section analysis to
be about 2/3. One can improve on this nominal background
by using properties of the particular sample being analyzed
which are sensitive to the background fraction.
One such property is the average value of one of the top
quark discriminants (either $\mathcal{D}_{\rm LB}$ or $\mathcal{D}_{\rm NN}$). The background
fraction can be calculated as

$$\text{BG fraction} = (\mathcal{D}^T - \mathcal{D}^D)/(\mathcal{D}^T - \mathcal{D}^B), \qquad (8.2)$$

where $\mathcal{D}^T$ is the average value expected for $t\bar{t}$ events, $\mathcal{D}^B$
is that expected for background events, and $\mathcal{D}^D$ is that
of the sample being analyzed.
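Equation (8.2) is a linear interpolation between the pure-signal and pure-background expectations. A minimal sketch follows; the discriminant averages used in the example are hypothetical and are not values from the analysis.

```python
def background_fraction(avg_signal, avg_background, avg_sample):
    """Background fraction from Eq. (8.2): where the sample average sits
    between the pure-signal and pure-background expectations."""
    return (avg_signal - avg_sample) / (avg_signal - avg_background)


# Hypothetical average discriminant values (illustration only).
d_top, d_bkg, d_data = 0.70, 0.30, 0.46
fraction = background_fraction(d_top, d_bkg, d_data)
print(f"estimated background fraction = {fraction:.2f}")
```

A sample average equal to the signal expectation gives a background fraction of zero, and one equal to the background expectation gives one. Statistical fluctuations can push the estimate outside this range, which is one reason to combine it with other estimates.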

We can do an analogous calculation using the $\chi^2/2$
plot. There is, however, a complication, due to the fact
that the $\chi^2/2$ plots depend on the top quark mass to
a much greater extent than do the likelihood discriminants.
Therefore, to get a background from this method,
we need a rough estimate of the top quark mass. We find
this as follows. For each sample, we construct the average
$\chi^2/2$ plot. We compare the plot from data to that predicted
from MC signal plus background, with the MC top
quark mass varied in 10 steps from 140 to 210 GeV/c².
We pick the mass which yields the smallest RMS difference
with the data.

An additional complication is that, in general, the average
$\chi^2/2$ plots for signal and background will cross at
some $m_{\rm fit}$. We thus define the variable

$$C = \sum_{m_{\rm fit} > m_{\rm cross}} \chi^2/2\,(m_{\rm fit}) \; - \sum_{m_{\rm fit} < m_{\rm cross}} \chi^2/2\,(m_{\rm fit}), \qquad (8.3)$$

where $m_{\rm cross}$ is the point at which the plots cross.
($m_{\rm cross}$ is near 150 GeV/c² for top quark masses above
160 GeV/c².) We then estimate the background in the
same manner as before, using

$$\text{BG fraction} = (C^T - C^D)/(C^T - C^B), \qquad (8.4)$$

where $C^T$, $C^B$, and $C^D$ are the values of C from MC
signal, background, and the data sample, respectively.

The background fraction for the full precut sample
is taken to be the average of three values: the nominal
value, the value determined from the top quark discriminants,
and the value from the $\chi^2/2$ plot. They are
weighted by the squared inverses of their errors.

When analyzing subsets of the precut sample, we determine
the nominal background for the subset by scaling
down the background determined from the full precut
sample. The subset background fraction is then the
weighted average of this nominal background fraction and
the fraction estimated from the $\chi^2/2$ plots. The background
estimate from the top quark discriminants is not
used in this case, since the subset selections tend to make
the distributions of these discriminants similar for signal
and background. The precut and LB subset background
fractions determined from the data are 0.60 and 0.32,
respectively.

For each $m_{\rm fit}$, we subtract the $\chi^2/2$ contribution expected
for the background from the total. This is evaluated
over the range 100-250 GeV/c² with a distance
between points $\Delta m_{\rm fit}$ = 10 GeV/c². We then extract
the top quark mass and error using a quadratic fit near
the minimum of this background-subtracted $\chi^2/2$ plot.
The extracted mass $m_{\rm min}$ is the value at which the fit
function has its minimum, and its error is the deviation
that corresponds to an increase of 0.5 units above the
minimum. We try to use as many points as possible in
the fit provided that the plot remains parabolic over the
fit range. The algorithm used to select the fit range is
determined empirically by fitting the average $\chi^2/2$ plots
for $t\bar{t}$ Monte Carlo events. With $\Delta m_{\rm fit}$ = 10 GeV/c², at
least three points below and two points above the minimum
are required; thus, the mass range covered is at least
50 GeV/c². If necessary, we add points at the extremes
until the value of $\chi^2/2$ exceeds that at the minimum by
an amount equal to the number of events in the plot.
However, we add points on the high side only if the $\chi^2/2$
values change at an increasing rate, as expected for a
parabola. We also do some fits with $\Delta m_{\rm fit}$ = 5 GeV/c²
over the range 100-255 GeV/c². In that case, we use at
least five points on each side of the minimum.



C. Results of fits to Monte Carlo events



Table XXX contains results of fits to average $\chi^2/2$ plots
from MC samples. The mass $m_{\rm min}$ (from a quadratic fit
near the minimum) for $t\bar{t}$ Monte Carlo is slightly different
from the MC input mass. It has a roughly linear
dependence on the input top quark mass, with a slope
that is only slightly smaller than that determined from
fits with the correct jet assignment. A linear fit to these
points gives the following prescription for a "corrected"
mass $m_{\rm corr}$:

$$m_{\rm corr} = (m_{\rm min} - 27.0~\mathrm{GeV}/c^2)/0.815. \qquad (8.5)$$

This relation is used to correct the masses $m_{\rm min}$ obtained
from fits.
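As a worked example of this correction, take the first ensemble row of Table XXXI (precut sample, herwig, $m_t$ = 165 GeV/c²), which lists an average $m_{\rm min}$ of 160.0 GeV/c²:

```latex
\begin{equation*}
m_{\rm corr} = \frac{160.0 - 27.0}{0.815}~\mathrm{GeV}/c^2 \approx 163.2~\mathrm{GeV}/c^2 ,
\end{equation*}
```

which matches the average corrected mass of 163.2 GeV/c² listed in the same row. Because the slope 0.815 is less than one, widths in $m_{\rm min}$ are stretched by a factor 1/0.815 ≈ 1.23 when expressed in terms of $m_{\rm corr}$.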



D. Ensemble studies 

We study the performance of the PL method by forming
ensembles of simulated experiments consisting of MC
events which pass the precuts. These experiments contain
N = 78 events each, with an average of 26 events
from signal and the balance from background. The results
are shown in Table XXXI. (All use $\Delta m_{\rm fit}$ =
10 GeV/c².) The typical errors on the average ensemble
masses are about 0.5 GeV/c², so the LB and NN subset
masses are consistent. We also show in Table XXXII results
for ensembles of experiments consisting of 26 signal
events and no background. The agreement of the corresponding
average mass values between Tables XXXI
and XXXII indicates that the background subtraction
does not produce a mass bias.

The widths of the $m_{\rm corr}$ distributions for the subset
analyses are smaller than those from the entire sample;
further, the widths for LB subsets are all smaller than
those for the corresponding NN subsets. The widths for
the LB subset are smaller because the background for
the LB subset is smaller than for the NN subset: at
$m_t$ = 175 GeV/c², the background fraction for LB is
35%, and for NN, it is 42%. Results will therefore be
based primarily on LB subset fits. The widths and shifts
from the input mass are plotted in Figs. 40 and 41 for
the LB subset.

Figure 42 shows the pull distribution (as defined in
Eq. (7.11)) for LB subset fits. We find the error on $m_{\rm corr}$
by dividing the width of the quadratic fit by the slope of
the mass correction. A Gaussian fit to the pull distribution
for $m_t$ = 175 GeV/c² has a width of 1.51. Therefore,
the corrected errors from quadratic fits typically underestimate
the width of the ensemble mass distribution and
need to be scaled up by an additional factor of 1.51.
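The extraction chain described in this section (quadratic fit near the minimum, mass correction of Eq. (8.5), and error scaling) can be sketched as follows. This is a simplified illustration under assumptions: the fit range is taken as given rather than chosen by the empirical algorithm described earlier, the least-squares solver is a plain normal-equation solution, and the input curve is an invented parabola rather than a real background-subtracted $\chi^2/2$ plot.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_parabola(xs, ys):
    """Least-squares fit of y = a*t^2 + b*t + c with t = x - mean(xs).

    Returns (a, b, c, x0), where x0 = mean(xs) is the centering offset.
    """
    x0 = sum(xs) / len(xs)
    t = [x - x0 for x in xs]
    s = [sum(ti ** k for ti in t) for k in range(5)]              # moments of t
    r = [sum(yi * ti ** k for ti, yi in zip(t, ys)) for k in range(3)]
    matrix = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    d = det3(matrix)
    coeffs = []
    for col in range(3):                                          # Cramer's rule
        m = [row[:] for row in matrix]
        for row in range(3):
            m[row][col] = rhs[row]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    return a, b, c, x0


def extract_mass(xs, ys, offset=27.0, slope=0.815, pull_width=1.51):
    """Return (m_min, width, m_corr, err_corr) from a chi2/2 curve."""
    a, b, _, x0 = fit_parabola(xs, ys)
    m_min = x0 - b / (2.0 * a)          # vertex of the parabola
    width = math.sqrt(0.5 / a)          # rise of 0.5 above the minimum
    m_corr = (m_min - offset) / slope   # Eq. (8.5)
    err_corr = pull_width * width / slope
    return m_min, width, m_corr, err_corr


# Invented parabolic curve with its minimum at 162 GeV/c^2 and width 8 GeV/c^2.
masses = [130.0 + 10.0 * i for i in range(7)]
curve = [20.0 + (m - 162.0) ** 2 / (2.0 * 8.0 ** 2) for m in masses]
m_min, width, m_corr, err_corr = extract_mass(masses, curve)
print(f"m_min = {m_min:.1f}, width = {width:.1f}, "
      f"m_corr = {m_corr:.1f} +/- {err_corr:.1f} GeV/c^2")
```

The default constants are the offset and slope of Eq. (8.5) and the pull width of 1.51 quoted above for $m_t$ = 175 GeV/c². A different ensemble or subset would in general need its own pull width.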



E. Analysis of data sample 

We analyze the data for the two subsets defined by 
the LB and NN selections (see Sec. VI). These subset 
selections are about 80% efficient for the tt signal, versus 
about 30% for background. 

We select the data sample for analysis by requiring that 
each event have at least one fit with x 2 < 10- This yields 
a sample of 78 events, 32 of which pass the LB selection, 
and 33 of which pass the NN selection, with 27 events in 
common between these two subsets. (Due to differences 

FIG. 40. 68% widths of ensemble mass distributions for
different analyses. Squares are for PL fits to the LB subset,
circles are for LB variable-mass fits, and plus symbols are
for the NN variable-mass fits. Typical errors on the plotted
values are between 0.5 and 1.0 GeV/$c^2$.

FIG. 41. Same as Fig. 40 for mean ensemble mass deviations.

TABLE XXX. Results of fits to average $\chi^2/2$ plots from MC. $m_{\min}$ is the minimum of a quadratic fit to the points, "width"
is the width where the fit curve rises by 0.5, and $\langle m_{\rm fit} \rangle$ is the weighted average of the $m_{\rm fit}$ values, where the weights are $e^{-\chi^2/2}$.
Entries labeled "jet high" and "jet low" are after scaling jet energies by ±(2.5% + 0.5 GeV).

FIG. 42. Pull distribution for LB subset fits to precut ensemble
samples with $m_t = 175$ GeV/$c^2$. The curve is a Gaussian
fit to the region −3 to +3.

FIG. 43. (a) $\chi^2/2$ plots for the LB subset of the PR sample.
Data are the open squares, filled circles are the prediction for
a mixture of background and 175 GeV/$c^2$ top events, and
open triangles are the prediction for pure background. The
solid line joins the filled circles. (b) Background-subtracted
$\chi^2/2$ plot for LB subsets. Data are the open squares, and
filled circles are the prediction for 175 GeV/$c^2$ top events.
The dashed curve is a parabola fit near the minimum.

TABLE XXXI. Ensembles with N = 78 and a 1:2 signal/background ratio. Entries labeled "jet high" and "jet low" are
after scaling jet energies by ±(2.5% + 0.5 GeV). "Slope" is from a linear fit to the masses. The LB discriminant is used in the
background determination for analyses of the precut samples.



                           m_min       m_min      m_corr      m_corr        Width containing
                       avg. mass         RMS   avg. mass         RMS      68.27%      95.45%
                       (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)

Precut sample, HERWIG
m_t = 165 GeV/c^2          160.0         8.5       163.2        10.4        8.99       22.13
m_t = 170 GeV/c^2          163.5         8.4       167.5        10.3        8.85       21.88
m_t = 175 GeV/c^2          168.1         8.4       173.1        10.4        9.04       20.98
m_t = 180 GeV/c^2          171.8         9.5       177.7        11.7       10.00       22.77
Slope                       0.80                    0.98

LB subset, HERWIG
m_t = 150 GeV/c^2          150.6         7.3       151.7         8.9        7.68       16.84
m_t = 160 GeV/c^2          158.8         7.4       161.7         9.0        7.82       18.07
m_t = 165 GeV/c^2          161.6         7.1       165.2         8.7        7.34       17.27
m_t = 170 GeV/c^2          165.2         7.0       169.6         8.6        7.51       17.22
m_t = 175 GeV/c^2          169.6         6.7       175.0         8.2        7.93       16.83
  jet high                 172.6         7.5       178.7         9.2        8.22       18.32
  jet low                  167.0         8.0       171.7         9.9        8.35       19.73
m_t = 180 GeV/c^2          173.3         7.5       179.5         9.2        8.47       18.28
m_t = 190 GeV/c^2          182.4         7.7       190.7         9.5        8.61       19.54
Slope                       0.78                    0.96

LB subset, ISAJET
m_t = 160 GeV/c^2          158.6         8.9       161.5        10.9        9.23       21.02
m_t = 170 GeV/c^2          166.0         8.6       170.5        10.6        9.59       21.57
m_t = 180 GeV/c^2          173.0         9.2       179.1        11.3       10.38       22.44
m_t = 190 GeV/c^2          180.6        10.0       188.5        12.2       11.38       24.93
Slope                       0.73                    0.90

NN subset, HERWIG
m_t = 150 GeV/c^2          149.4         8.3       150.2        10.2        8.55       19.03
m_t = 160 GeV/c^2          158.1         8.3       160.8        10.2        8.75       20.21
m_t = 165 GeV/c^2          161.1         8.5       164.6        10.4        8.44       19.87
m_t = 170 GeV/c^2          164.8         7.8       169.1         9.6        8.41       19.10
m_t = 175 GeV/c^2          169.5         7.8       174.8         9.6        8.45       20.50
m_t = 180 GeV/c^2          173.3         8.5       179.5        10.5        9.53       21.30
m_t = 190 GeV/c^2          182.4         8.7       190.6        10.7        9.67       21.78
Slope                       0.81                    1.00









TABLE XXXII. Results of fits to LB subsets using ensembles with N = 26 and no background. Entries labeled "jet high"
and "jet low" are after scaling jet energies by ±(2.5% + 0.5 GeV). "Slope" is from a linear fit to the masses.

                           m_min       m_min      m_corr      m_corr        Width containing
Input mass             avg. mass         RMS   avg. mass         RMS      68.27%      95.45%
(GeV/c^2)              (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)   (GeV/c^2)

150                        150.6         5.0       151.6         6.1        5.96       12.07
160                        158.6         5.1       161.5         6.2        6.02       12.56
165                        161.6         4.7       165.2         5.8        5.62       12.18
170                        165.2         5.0       169.5         6.2        6.15       12.72
175                        169.8         5.0       175.2         6.2        6.06       12.51
  jet high                 172.6         5.3       178.7         6.5        6.41       13.27
  jet low                  166.9         5.5       171.7         6.7        6.40       13.78
180                        173.5         5.6       179.8         6.9        6.95       13.89
190                        182.7         5.8       191.0         7.1        6.99       14.40
200                        191.0         6.6       201.3         8.0        7.88       16.09
Slope                       0.81                    1.00

TABLE XXXIII. Fits to data samples.



                                                              BG fractions
Cut      N     Δm_fit           m_min          m_corr       precut     subset
             (GeV/c^2)       (GeV/c^2)       (GeV/c^2)

LB      32      10.0       171.0 ± 4.6     176.7 ± 8.4       0.60    [illegible]
LB      32       5.0       170.4 ± 4.3     176.0 ± 7.9       0.60       0.32
NN      33      10.0       164.3 ± 5.5     168.4 ± 10.1      0.65       0.41

Subset common to both PL and variable-mass
LB      31      10.0       169.0 ± 4.6     174.3 ± 8.5       0.56       0.29
LB      31       5.0       169.8 ± 4.4     175.2 ± 8.0       0.56       0.29
NN      32      10.0       163.0 ± 5.4     166.8 ± 9.9       0.60       0.38



in the kinematic fitting, three events in the variable-mass
analysis fail the $\chi^2$ cut for 3C SQUAW fits, and four events
not in the variable-mass analysis are included in the PL
analysis.) Results of fits to these samples are given in
Table XXXIII. They are listed for $\Delta m_{\rm fit}$ values of both
5 and 10 GeV/$c^2$. A 5 GeV/$c^2$ increment gives slightly
smaller errors. The $\chi^2/2$ plot for the LB subsample is
plotted in Fig. 43.

The top quark mass from the NN subset is smaller
than that from the LB subset, and has a larger error.
This is because the events accepted by the
NN selection but rejected by the LB selection tend to be
of lower mass than those accepted by LB but rejected by
NN. These low mass events are typically rejected from
the LB subsample by the $H_{T2} > 90$ GeV cut.

If we look at the subset of events selected by both the
PL and variable-mass analyses, there are 74 events, with
31 events passing the LB selection and 32 events passing
the NN selection. Results of fits to these samples are also
given in Table XXXIII.



F. Systematic errors 

This section gives estimates of the systematic errors for
the PL analysis. The uncertainty in the jet energy scale
is ±(2.5% + 0.5 GeV) per jet (Sec. IV). To estimate the
effect of this on $m_{\rm corr}$, we redo the fits for a $t\bar{t}$ MC sample
with all jets scaled up or down by this uncertainty. The
results are given in Table XXX. After applying the slope
correction, this yields an estimate of ±3.6 GeV/$c^2$. Note
that this is only valid in the limit of a large number of $t\bar{t}$
events with negligible background. We can also estimate
this error by constructing ensembles with all the jets in
the $t\bar{t}$ signal sample scaled up or down. The results are
given in Table XXXI; the estimated error is ±3.5 GeV/$c^2$.
The same value for this error would be obtained using the
mass shifts from ensemble studies with no background,
as given in Table XXXII.
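
As a check of the arithmetic, the ±3.5 GeV/$c^2$ estimate is half the difference between the "jet high" and "jet low" corrected masses listed for the LB subset in Table XXXI (Table XXXII lists the same two values). A minimal Python sketch follows; the helper name is illustrative, and taking half the high-low difference as the symmetric error is an assumption that happens to reproduce the quoted number.

```python
def symmetric_shift(mass_high, mass_low):
    """Half the difference between the up-scaled and down-scaled results."""
    return 0.5 * (mass_high - mass_low)

# Corrected masses (GeV/c^2) from the "jet high" and "jet low" rows of
# Table XXXI (LB subset) for input m_t = 175 GeV/c^2.
M_CORR_JET_HIGH = 178.7
M_CORR_JET_LOW = 171.7

jet_scale_error = symmetric_shift(M_CORR_JET_HIGH, M_CORR_JET_LOW)
print(f"jet energy scale error = +/-{jet_scale_error:.1f} GeV/c^2")
```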

The differences seen in $m_{\min}$ between HERWIG events
and ISAJET events are shown in Table XXX. The
corresponding differences in $m_{\rm corr}$ vary from −1.6 to
2.6 GeV/$c^2$ over the range $m_t = 160$-$200$ GeV/$c^2$, and
have a minimum between 170 and 180 GeV/$c^2$. We then
construct ensembles using ISAJET events and compare
these results to those from HERWIG. This is done in
Table XXXI. The resulting difference varies from −0.9 to
2.2 GeV/$c^2$ over the range $m_t = 160$-$190$ GeV/$c^2$, so we
assign a systematic error of 2.2 GeV/$c^2$ for the signal
model.
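
The quoted range can be reproduced from the $m_{\rm corr}$ averages for the LB subset in Table XXXI. The Python sketch below assumes the difference is taken as HERWIG minus ISAJET, a sign convention inferred from the fact that it reproduces the quoted −0.9 to 2.2 GeV/$c^2$; the variable names are illustrative.

```python
# Average corrected masses (GeV/c^2) for the LB subset, keyed by input m_t,
# copied from Table XXXI.
M_CORR_HERWIG = {160: 161.7, 170: 169.6, 180: 179.5, 190: 190.7}
M_CORR_ISAJET = {160: 161.5, 170: 170.5, 180: 179.1, 190: 188.5}

# Difference at each input mass available for both generators.
differences = {
    m_t: round(M_CORR_HERWIG[m_t] - M_CORR_ISAJET[m_t], 1)
    for m_t in M_CORR_HERWIG
}

# The systematic error is taken as the largest absolute difference.
signal_model_error = max(abs(d) for d in differences.values())

for m_t, diff in differences.items():
    print(f"m_t = {m_t}: HERWIG - ISAJET = {diff:+.1f} GeV/c^2")
print(f"signal model error = {signal_model_error:.1f} GeV/c^2")
```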

We estimate the contribution to the systematic error
due to the choice of the VECBOS $Q^2$ scale and fragmentation
method by examining the four different choices listed
in Table XXX. One can see that our choice of average jet
$p_T$ scale and HERWIG fragmentation represents an intermediate
case. The resulting uncertainty in $m_t$ is obtained
by constructing ensembles from the different VECBOS parameter
choices (but still using the favored choice for
background calculation and subtraction). For ensemble
samples with $m_t = 175$ GeV/$c^2$ events, the average corrected
masses for the four choices range from 174.5 to
176.4 GeV/$c^2$, for a maximum difference of 1.9 GeV/$c^2$.

Some of the other systematic error contributions evaluated
for the LB and NN analyses (see Table XXIX)
cannot be determined in the same way for the PL analysis.
The noise and multiple interaction error is determined
from the shift in the mean fitted mass for the
variable-mass fits, which are not used in the PL analysis.
However, the kinematic fitters used give similar
results, so the size of this effect for the PL analysis
should be similar to that from the LB and NN variable-mass
analyses. The error due to Monte Carlo statistics
is assumed to be negligible. The LB-NN difference
can be calculated from the PL ensemble results in
Table XXXI. For the 170-180 GeV/$c^2$ mass range, the
mean LB-NN difference is 0.23 GeV/$c^2$. Finally, the likelihood
fit error contribution can be calculated from the
four LB fit values given in Table XXXIII. The RMS
of the four LB corrected mass values is 0.9 GeV/$c^2$.
Combining these error contributions in quadrature with
those for the energy scale (3.5 GeV/$c^2$), signal generator
(2.2 GeV/$c^2$ from the maximum HERWIG-ISAJET difference
in the 160-190 GeV/$c^2$ mass range), and VECBOS
flavors (1.9 GeV/$c^2$) gives a total PL systematic error of
4.8 GeV/$c^2$.
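
The individual numbers in this paragraph can be checked with a few lines of arithmetic. The Python sketch below uses the four LB $m_{\rm corr}$ values from Table XXXIII and the LB and NN $m_{\rm corr}$ averages at 170, 175, and 180 GeV/$c^2$ from Table XXXI. It assumes "RMS" means the population standard deviation, which reproduces the quoted 0.9. The noise and multiple-interaction contribution is not quoted in this section, so it is not included in the sum; the last line only shows the size that term would need to have to reach the quoted total, ignoring rounding.

```python
import math
import statistics

# Four LB corrected masses (GeV/c^2) from Table XXXIII.
lb_m_corr = [176.7, 176.0, 174.3, 175.2]
fit_rms = statistics.pstdev(lb_m_corr)

# LB and NN corrected masses at input m_t = 170, 175, 180 (Table XXXI).
lb = [169.6, 175.0, 179.5]
nn = [169.1, 174.8, 179.5]
lb_nn_difference = statistics.mean(a - b for a, b in zip(lb, nn))

# Contributions (GeV/c^2) that are quoted explicitly in the text.
contributions = {
    "jet energy scale": 3.5,
    "signal generator": 2.2,
    "VECBOS flavors": 1.9,
    "LB-NN difference": lb_nn_difference,
    "likelihood fit": fit_rms,
}
listed_total = math.sqrt(sum(v * v for v in contributions.values()))

# Size of the remaining (noise / multiple interaction) term implied by
# the quoted total of 4.8 GeV/c^2.
QUOTED_TOTAL = 4.8
implied_remainder = math.sqrt(QUOTED_TOTAL**2 - listed_total**2)

print(f"RMS of LB fits     = {fit_rms:.2f}")
print(f"mean LB-NN diff.   = {lb_nn_difference:.2f}")
print(f"listed terms total = {listed_total:.2f}")
print(f"implied remainder  = {implied_remainder:.2f}")
```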



G. Summary 

Pseudolikelihood analysis of the LB subset of the
data gives a top quark mass of 176.0 ± 7.9 (stat) ±
4.8 (syst) GeV/$c^2$. This is based upon a 14-point
quadratic fit (with a mass increment of 5 GeV/$c^2$) to the
background-subtracted $\chi^2/2$ plot over the range $m_{\rm fit}$ =
140-205 GeV/$c^2$.
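
The quadratic fit summarized above can be sketched in a few lines. The Python example below builds the 14 mass points from 140 to 205 GeV/$c^2$ in 5 GeV/$c^2$ steps, fits a parabola by least squares, and reads off the minimum and the half-width at which the curve rises by 0.5. The input points are a toy parabola whose minimum (170.4) and width (4.3) are borrowed from the uncorrected LB entry of Table XXXIII purely for illustration; the measured $\chi^2/2$ points are not given in the text, and the function names are assumptions.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(masses, values):
    """Least-squares fit of values = p0 + p1*u + p2*u**2, u = mass - mean.

    Returns (mass at the minimum, half-width where the curve rises by 0.5).
    """
    center = sum(masses) / len(masses)
    u = [m - center for m in masses]  # centering improves conditioning
    s = [sum(x**k for x in u) for k in range(5)]
    t = [sum(y * x**k for x, y in zip(u, values)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(normal)
    coeffs = []
    for col in range(3):  # Cramer's rule on the normal equations
        modified = [row[:] for row in normal]
        for r in range(3):
            modified[r][col] = t[r]
        coeffs.append(det3(modified) / d)
    _, p1, p2 = coeffs
    return center - p1 / (2.0 * p2), math.sqrt(0.5 / p2)

# 14 points, 140 to 205 GeV/c^2 in 5 GeV/c^2 steps.
masses = [140.0 + 5.0 * i for i in range(14)]

# Toy chi^2/2 values: an exact parabola (illustrative inputs only).
TOY_MIN, TOY_WIDTH, TOY_OFFSET = 170.4, 4.3, 12.0
half_chi2 = [(m - TOY_MIN) ** 2 / (2.0 * TOY_WIDTH**2) + TOY_OFFSET for m in masses]

m_min, width = fit_parabola(masses, half_chi2)
print(f"m_min = {m_min:.1f} GeV/c^2, width = {width:.1f} GeV/c^2")
```

The width returned here is the uncorrected one; in the text it is then divided by the slope of the mass correction and scaled by the pull width.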



IX. FURTHER KINEMATIC STUDIES 

This section presents distributions of additional kinematic
quantities derived from the data. In these plots,
the data sample is compared to a mixture of $t\bar{t}$ (generated

FIG. 44. Number of jets in each event with $E_T > 15$ GeV
and $|\eta| < 2$ for (a) HERWIG ($m_t = 170$ GeV/$c^2$) and (b) ISAJET
($m_t = 170$ GeV/$c^2$). The histogram is data, open triangles
are expected background, and filled circles are expected signal
plus background.



with HERWIG with $m_t = 175$ GeV/$c^2$ unless otherwise
specified) and background models. The distributions are
shown for the LB subsample and are normalized according
to the results of the LB analysis. There are 18.5 signal
events and 12.5 background events expected in this subsample.
The error bars shown on these plots are from
signal and background sample statistics only, and do not
include the correlated error in the overall normalization.

To test the compatibility of our predictions with the 
data, we use a Kolmogorov-Smirnov (K-S) test |p^| . The 
resulting probability is indicated on each plot. Note that 
binning the data induces an upwards bias in the K-S 
probabilities. To mitigate this effect, all such probabil- 
ities for distributions of continuous variables are calcu- 
lated using histograms consisting of 10,000 bins. 
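
The effect of binning on the K-S probability can be seen in a toy comparison. The Python sketch below computes the K-S distance between a binned data sample and a binned prediction, then converts it to a probability with the asymptotic Kolmogorov series. The toy data, the flat prediction, the sample size of 32, and the use of $\lambda = \sqrt{n}\,D$ with $n$ the number of data events are all illustrative assumptions, not the procedure of the cited reference.

```python
import math
import random

def histogram(values, n_bins, lo, hi):
    """Fill a fixed-width histogram; values at hi go into the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

def ks_distance(observed, expected):
    """Largest difference between the two normalized cumulative histograms."""
    n_obs, n_exp = sum(observed), sum(expected)
    cum_obs = cum_exp = distance = 0.0
    for o, e in zip(observed, expected):
        cum_obs += o / n_obs
        cum_exp += e / n_exp
        distance = max(distance, abs(cum_obs - cum_exp))
    return distance

def ks_probability(distance, n_events):
    """Asymptotic Kolmogorov probability Q(lambda), lambda = sqrt(n) * D."""
    lam = math.sqrt(n_events) * distance
    if lam < 0.2:  # Q is 1 to high accuracy here; the series converges slowly
        return 1.0
    total = 0.0
    for j in range(1, 101):
        term = 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(1.0, max(0.0, total))

# Toy data: 32 values on [0, 1) compared with a flat prediction.
random.seed(7)
values = [random.random() ** 1.5 for _ in range(32)]

results = {}
for n_bins in (10, 10000):
    data = histogram(values, n_bins, 0.0, 1.0)
    prediction = [1.0] * n_bins
    d = ks_distance(data, prediction)
    results[n_bins] = (d, ks_probability(d, len(values)))
    print(f"{n_bins:6d} bins: D = {d:.3f}, probability = {results[n_bins][1]:.3f}")
```

With coarse bins the cumulative distributions are compared at fewer points, so the maximum distance can only stay the same or shrink, and the probability can only stay the same or grow. Using many fine bins approaches the unbinned test.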

Figure 44 shows the distribution of the number of jets
in each event in the sample. For comparison, the prediction
of ISAJET is shown as well as that of HERWIG. (Note
that since the number of jets is unavoidably a discrete
variable, the K-S probabilities are expected to be biased
high.) Figure 45 shows the transverse mass of the lepton
and neutrino. The slight rise of the prediction at low transverse mass
is due to the QCD multijet background. Figure 46 shows
the total transverse momentum $k_T$ (vector sum) of all
the objects used in the mass fit. (The full jet corrections
are used; however, for this plot only, all untagged jets
are corrected using the light quark corrections.) Note
that due to the procedure of using only the top four jets
for the fit, this is not necessarily the actual transverse
momentum of the $t\bar{t}$ system ($k_T$ tends to be somewhat
lower, on average).

The remaining distributions depend on the results of
the kinematic fit. For these, we plot the result corresponding
to the jet permutation with the smallest $\chi^2$. We
also show the distributions which result if the data and
Monte Carlo are refit with the additional constraint that
$m_t = 173.3$ GeV/$c^2$. This is now a 3C fit. Note, however,
that when making the $\chi^2$ cut to define the sample, the
2C $\chi^2$ is used in all cases; thus, adding the additional
constraint does not change the sample definition. The

FIG. 45. Transverse mass of the lepton and neutrino. The
histogram is data, open triangles are expected background, 
and filled circles are expected signal plus background. 

FIG. 46. Total transverse momentum $k_T$ of all objects used
in the mass fit (the highest four jets, the lepton, and the $\not\!E_T$).
This is a vector sum. The histogram is data, open triangles
are expected background, and filled circles are expected signal
plus background.

FIG. 47. $\chi^2$ distributions from the 3C fit. The histogram is
data (with two overflows), open triangles are expected background,
and filled circles are expected signal plus background.

FIG. 48. Invariant mass distribution of the $t\bar{t}$ pair. The
histogram is data, open triangles are expected background,
and filled circles are expected signal plus background. (a) 2C
fit, (b) 3C fit with $m_t = 173.3$ GeV/$c^2$.

distribution of the 3C fit $\chi^2$ is shown in Fig. 47. There
are five events with a 3C fit $\chi^2 > 10$, compared to $\approx 7$
expected. They are consistent with a mixture of background
and $t\bar{t}$ events where the wrong set of four jets was
selected.

Figure 48 shows the invariant mass of the $t\bar{t}$ pair.
Figure 49 shows the transverse momenta of the two top
quarks, and Fig. 50 shows their pseudorapidity.
Figures 51 and 52 show, respectively, the distance in $\eta$ and
$\phi$ between the two top quarks. The mean of the 13 K-S
probabilities we calculate from continuous distributions is
(53 ± 9)%, consistent with the hypothesis that our predictions
for $t\bar{t}$ signal plus background adequately represent
our data.
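
For orientation, the size of the statistical scatter expected on such a mean can be estimated in a simple limit. If the predictions describe the data and the 13 probabilities were independent and uniformly distributed between 0 and 1, their mean would be 50% with a standard deviation of $1/\sqrt{12 \times 13} \approx 8\%$. The Python sketch below works this out. Independence is an idealizing assumption, since the distributions are built from the same events, and this is not necessarily how the quoted ±9% was obtained.

```python
import math

N_PROBABILITIES = 13    # number of K-S probabilities averaged in the text
OBSERVED_MEAN = 0.53    # quoted mean

# A uniform variable on [0, 1] has mean 1/2 and standard deviation 1/sqrt(12).
expected_mean = 0.5
sigma_single = 1.0 / math.sqrt(12.0)
sigma_mean = sigma_single / math.sqrt(N_PROBABILITIES)

# Deviation of the observed mean from expectation, in units of sigma_mean.
deviation = (OBSERVED_MEAN - expected_mean) / sigma_mean

print(f"expected mean = {expected_mean:.2f} +/- {sigma_mean:.3f}")
print(f"observed mean deviates by {deviation:.2f} standard deviations")
```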

FIG. 49. Same as Fig. 48 for the transverse momenta of
the top quarks (two entries per event).




FIG. 50. Same as Fig. 48 for the pseudorapidities of the
top quarks (two entries per event).

FIG. 51. Same as Fig. 48 for the difference in pseudorapidity
$\eta$ between the two top quarks.

FIG. 52. Same as Fig. 48 for the difference in azimuthal
angle $\phi$ between the two top quarks.

FIG. 53. Comparison of the measured top quark mass and
production cross section with theoretical calculations [ [j3| , 

X. CONCLUSIONS 

In summary, we measure the top quark mass using lepton
+ jets events to be $m_t(lj) = 173.3 \pm 5.6$ (stat) $\pm$
5.5 (syst) GeV/$c^2$. We have also measured the top
quark mass from dilepton events, yielding $m_t(ll) =
168.4 \pm 12.3$ (stat) $\pm$ 3.6 (syst) GeV/$c^2$. We combine
these two values, assuming that the systematics for jet
energy scale, multiple interactions, and $t\bar{t}$ signal generator
dependencies are fully correlated, and that other
systematics are uncorrelated. The result is

$$m_t = 172.1 \pm 5.2\ \text{(stat)} \pm 4.9\ \text{(syst)}\ \text{GeV}/c^2 = 172.1 \pm 7.1\ \text{GeV}/c^2. \qquad (10.1)$$
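
The second form of Eq. (10.1) follows from adding the statistical and systematic errors in quadrature. A minimal Python sketch of that step is given below; it checks only the final error sum, not the correlated combination of the two channels, whose per-source inputs are not listed in this section. The function name is illustrative.

```python
import math

def total_error(stat, syst):
    """Combine statistical and systematic errors in quadrature."""
    return math.sqrt(stat**2 + syst**2)

# (stat, syst) errors in GeV/c^2, as quoted in the conclusions.
quoted = {
    "lepton + jets": (5.6, 5.5),
    "dilepton": (12.3, 3.6),
    "combined": (5.2, 4.9),
}

totals = {name: total_error(stat, syst) for name, (stat, syst) in quoted.items()}
for name, value in totals.items():
    print(f"{name:14s} total error = {value:.1f} GeV/c^2")
```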

In a separate publication [6], we describe the measurement
of the $p\bar{p} \to t\bar{t}$ production cross section. The result
for $m_t = 172.1$ GeV/$c^2$ is

$$\sigma(m_t = 172.1\ \text{GeV}/c^2) = 5.6 \pm 1.8\ \text{pb}. \qquad (10.2)$$

Our results are plotted in Fig. 53 and are compared to
several theoretical calculations of the $t\bar{t}$ production cross
section [53]. The agreement of the standard model expectations
with our measurement is excellent. We also
find agreement between our data and predictions for distributions
of various kinematic variables for $t\bar{t}$ decays.

An alternate analysis technique, which applies three-constraint
fits at fixed top quark masses to the lepton + jets
data, gives $m_t(lj) = 176.0 \pm 7.9$ (stat) $\pm$
4.8 (syst) GeV/$c^2$, consistent with the above result.